ABSTRACT


Automatic  image  pattern  recognition  techniques  have  been  successfully  applied  to 
improving  productivity  and  quality  in  both  manufacturing  and  service  applications. 

Automatic image pattern recognition algorithms are often developed and tested using
unique data bases for each specific application. Quantitative comparison of different
approaches and extrapolation of existing techniques to new applications are difficult
or impossible.

To facilitate data interchange in this area, a two-day workshop was held at the National
Bureau of Standards in Gaithersburg, Maryland on June 3 and 4, 1976.

The  workshop  considered  the  issues  involved  with  interchange  of  images  as  data  in 
standard  formats  on  magnetic  tape.    Specifically,  the  workshop  addressed  the  following 
objectives: 

1.  To  define  mechanisms  for  achieving  a  standard  format  for  magnetic  tape  interchange. 

2.  To  define  requirements  for  documentation  of  the  recording  environment  of  an  image. 

3.  To  recommend  mechanisms  for  selecting  and  distributing  prototype  images. 

4.  To  consider  the  requirements  and  to  explore  the  prospect  for  a  language  to 
describe  image  content  and  structure. 


KEY  WORDS:    Automation;  calibration;  data  formats;  documentation;  image  content  language; 
image  processing;  pattern  recognition;  prototype  images;  standards. 

INTRODUCTION


Automatic  image  pattern  recognition  techniques  have  been  successfully  applied  to 
improving  productivity  and  quality  in  both  manufacturing  and  service  applications. 

Automatic image pattern recognition algorithms are often developed and tested using
unique data bases for each specific application. Quantitative comparison of different
approaches and extrapolation of existing techniques to new applications are difficult
or impossible.

At the suggestion of the Electronic Industries Association (EIA), and with the support
and cooperation of the IEEE Computer Society Machine Pattern Recognition Group and the
Association for Computing Machinery (ACM) Special Interest Group on Graphics, the National
Bureau of Standards conducted a two-day workshop on problems concerned with the adoption
of standards for the interchange of image pattern recognition data. The interests
of the various collaborating groups derived from concerns in industry for advancing
the state of development in the pattern recognition and image processing art, concerns
in the engineering community for the adoption of standards for testing equipment, and
concerns in the computing community for the adoption of standards relating to the testing
of algorithms.

A steering committee consisting of the chairmen of the various sessions of the workshop
developed,  over  a  period  of  several  months,  a  program  which  would  deal  with  several 
issues  that  seem  central  in  various  ways  to  the  question  of  adoption  of  standards  for 
image  interchange.    The  members  of  this  group  were: 

William  Alford,  National  Aeronautics  and  Space  Administration 

John  Dehne,  US  Army  Night  Vision  Lab,  representing  NATO 

John  M.  Evans,  Jr.,  National  Bureau  of  Standards 

Russell  Kirsch,  National  Bureau  of  Standards 

James  B.  McFerran,  Sperry  Univac,  representing  IEEE 

Roger N. Nagel, National Institutes of Health, representing EIA

Theo.  Pavlidis,  Princeton  University,  representing  IEEE 

Judith  M.  S.  Prewitt,  National  Institutes  of  Health,  representing  IEEE 

Azriel Rosenfeld, University of Maryland, representing the journal Computer Graphics
and Image Processing

Gene Thorley, US Geological Survey EROS Program

There  were  four  distinct  problem  areas  the  workshop  addressed  itself  to: 

1.  The  adoption  of  standards  for  magnetic  tape  formats. 

2.  The  documentation  of  the  recording  environment  for  image  scanning  and  transducing. 

3.  The  acceptance  of  prototype  image  data  bases. 

4.  The  description  of  image  content  and  structure. 

The  first  and  most  clearly  necessary  problem  for  concern  in  the  workshop  was  the  question 
of  agreeing  upon  formats  for  magnetic  tape  recording.    Clearly  any  form  of  interchange 
among  research  and  development  workers  would  be  advanced  by  the  existence  of  an  agreed 
upon  format  or  set  of  formats  for  recording  images  when  they  are  interchanged  among 
groups  and  between  those  producing  image  data  and  those  using  data  for  pattern  recognition 
and  image  processing  purposes.    This  was  therefore  the  topic  of  the  first  session  of 
the  workshop. 

The  second  session  arose  from  a  concern  particularly  in  the  academic  research  community 
for  documentation  of  the  recording  conditions  under  which  images  are  produced.    In  many 
cases images produced with specialized equipment (particularly transducers and devices
which are peculiar to a particular environment) are exchanged between the laboratory
producing the data and other laboratories having no familiarity with the recording
equipment. Much valuable information is lost in the case where the characteristics
of  the  recording  equipment  and  the  whole  environment  are  not  documented.    It  was  in  an 
attempt  to  heighten  sensitivity  to  the  need  for  documenting  the  recording  environment 
conditions  that  this  second  session  of  the  workshop  was  conducted. 

The third session addressed itself to the production of prototype data bases. There
are at least two clear needs for such prototype data bases. The first is the analog
of the problem in testing ordinary optical instrumentation, where test patterns are used
for specifying such properties as resolution. For testing various types of devices the
need for prototype data bases is fairly evident. Prototype data bases are likewise
necessary for testing algorithms, since the useful intercomparison of pattern recognition
algorithms could be enhanced if agreed upon prototype data bases could be interchanged
for testing these algorithms.

The  fourth  session  of  the  workshop  concerned  with  describing  image  content  and  structure 
arose  from  a  need  expressed  occasionally  in  the  academic  community  where  images  of  a 
very  specialized  sort  are  recorded.    Typically  these  images  are  produced  in  laboratories 
where  specialized  talents  exist  for  interpreting  those  images.    Obvious  examples  occur 
in  the  biomedical  community.    When  these  images  are  interchanged  between  the  producers, 
typically  biomedical  scientists,  and  the  computer  scientists  attempting  to  do  pattern 
recognition  research,  there  is  often  a  failure  to  communicate,  along  with  the  images, 
the  suitable  descriptions  of  the  articulated  structure  sufficient  to  enable  the  computer 
scientists  to  understand  what  the  content  and  structural  descriptions  of  these  images 
are.    It  was  in  an  attempt  to  investigate  the  possibility  of  compiling  such  structural 
descriptions that this last session of the workshop was conducted.

The reader of these workshop proceedings will easily note that none of these four
areas has yet advanced to the stage where consensus exists, despite the widespread
recognition of the need for such a consensus. He will also note equally clearly that
the different areas of concern are in different stages of development. Perhaps this
differential status of the different areas can serve as a source of motivation for increasing
activities in those areas which are more backward and for encouraging the continued
pace in those areas which have already shown noteworthy progress.

John  M.  Evans,  Jr. 
Acting  Manager 

Office  of  Developmental  Automation 
and  Control  Technology 

Institute  for  Computer  Sciences 
and  Technology 

National  Bureau  of  Standards 

Roger  N.  Nagel 
Senior  Staff  Fellow 

National  Institute  for  Dental  Research 
National  Institutes  of  Health 

Russell  Kirsch 

Head,  Artificial  Intelligence  Research 
Applied  Mathematics  Division 
Institute  for  Basic  Standards 
National  Bureau  of  Standards 

Introductory Comments
at  the 

Workshop  on  Standards  for  Image  Pattern  Recognition 
June  3,  1976 

by 

Dr.  Ruth  M.  Davis 
Director,  Institute  for 
Computer  Sciences  and  Technology 


I  am  delighted  to  welcome  you  to  the  National  Bureau  of  Standards,  and  particularly 
to  this  landmark  Workshop  on  Standards  for  Image  Pattern  Recognition. 

Your  meeting  here  -  to  explore  some  of  the  issues  in  standards  development  for  image 
pattern  recognition  -  is  a  most  appropriate  activity  for  the  National  Bureau  of  Standards. 

NBS  has  long  been  involved  in  standards  development  and  improvement,  the  interchange 
of  information  and  the  application  of  technology  for  public  benefit. 

This  year  the  Bureau  is  celebrating  its  75th  anniversary  as  the  Nation's  physical 
science  and  measurement  laboratory.    Throughout  its  history  NBS  has  provided  the  basis  for 
the  Nation's  measurement  standards.    In  addition  to  a  traditional  role  in  the  area  of 
weights  and  measures,  NBS  has  long  been  involved  in  product  testing  and  research  on  test 
methods  and  specifications. 

NBS  has  been  at  the  forefront  of  new  technologies  in  order  to  provide  the  foundation 
for  science  and  commerce  and  to  effectively  serve  government  and  the  public.  Computer 
technology is one of those areas in which NBS has been at the forefront. In the 1940's
and 1950's Bureau scientists designed and developed SEAC, the first general purpose, stored
program computer operated in the U.S.

Continuing  technological  developments  in  ensuing  years  coupled  with  growing  Federal 
Government  dependence  on  computers  for  information  processing  led  to  legislation  in  1965 
to  establish  Department  of  Commerce  responsibility  for  improving  the  utilization  of 
computers  by  Federal  agencies. 

By  that  legislation,  the  Brooks  Act,  NBS  is  directed  to  set  standards  for  Federal 
procurement  and  use  of  computers,  to  advise  other  agencies  on  the  efficient  use  of  computers 
and  to  carry  out  research  and  development  in  computer  science  and  technology. 

One area in which we carry out these functions is the application of computers in
automation  technology.    The  particular  aspect  you  will  be  discussing  in  this  Workshop, 
automatic  image  pattern  recognition,  is  especially  relevant  to  growing  concerns  about 
productivity  and  quality  of  goods  and  services  in  the  U.S.  today. 

There  have  been  dramatic  shifts  in  the  make-up  of  the  labor  force  in  the  U.S.  over 
the  past  few  decades.    Between  1960  and  1970  the  increase  of  workers  in  the  service 
industries  was  about  12  times  the  increase  of  workers  in  the  goods  producing  industries. 

Today  about  2/3  of  the  labor  force  is  employed  in  the  Service  Sector  which  includes 
Federal,  state  and  local  government  employees  and  people  engaged  in  services  such  as  health 
care,  wholesale  and  retail  trade,  services  and  financial  and  legal  work. 

The  rise  in  importance  of  the  Service  Sector  has  been  accompanied  by  some  disturbing 
problems: 

Productivity  in  the  Service  Sector  has  been  lagging.    The  annual  growth  rate  of  labor 
productivity  in  the  Service  Sector  has  been  significantly  lower  than  the  growth  rate  in 
the  goods  sector:  2.7%  vs.  3.4%  between  1960  and  1970. 

Secretary  of  Commerce  Elliott  Richardson  is  especially  interested  in  this  aspect  of 
services.    He  says  that  available  technology  to  improve  productivity  has  not  been  adequately 
applied,  and  he  cites  health  care  specifically  as  a  service  which  he  believes  can  be  improved 
by  the  application  of  technology. 

Services  have  contributed  to  increased  costs.    As  an  example,  the  costs  of  health 
care  have  increased  faster  than  any  other  component  of  the  consumer  price  index  during  the 
past  two  decades.    Between  1950  and  1974  medical  costs  increased  by  180%  while  the  CPI 
rose  by  105%,  causing  health  care  costs  to  increase  from  4.6%  to  7.6%  of  the  GNP. 

Salary  increases  in  the  Service  Sector  have  outpaced  increases  in  the  goods  sector. 
In  the  years  from  1953-74,  pay  increases  in  manufacturing  were  141%,  while  increases  for 
State  and  local  government  employees  were  188%  and  for  other  services  171%. 

There  is  widespread  dissatisfaction  with  the  quality  of  services.    A  report  on 
Automation  Opportunities  in  the  Service  Sector  for  the  Federal  Council  for  Science  and 
Technology  found  that  about  87%  of  the  consumer  complaints  received  by  Better  Business 
Bureaus  and  the  Office  of  Consumer  Affairs  were  directed  toward  the  service  industries 
with  66%  directed  toward  services  actually  bought  in  the  marketplace. 


Automation  technology  can  play  a  key  role  in  attacking  the  problems  of  the  Service 
Sector  by  improving  productivity  and  quality  of  services. 

The  application  of  automation  in  agriculture  has  greatly  improved  productivity.  As  a 
result  we  spend  the  smallest  percentage  of  our  annual  income  for  food  of  any  nation  in  the 
world,  and,  in  addition,  we  provide  for  most  of  the  food  aid  shipments  in  the  world. 

Automation in manufacturing is widespread, with applications in the assembly-line
production of automobiles and appliances. Numerically controlled tools, industrial robots
and computer-aided manufacturing techniques are now beginning to be used in the rest of
manufacturing industry, with potential increases in productivity running to thousands of
percent.

The  principal  experiences  with  automation  in  the  Service  Sector  have  been  in  the 
applications  of  computers  and  the  mechanization  of  paper  handling.    The  capability  of 
computers  to  process  information  quickly  and  accurately  is  utilized  by  service  industries 
such  as  banking,  credit,  financial  and  insurance.    Most  large  organizations  use  computers 
for  payroll  calculations,  health  and  employee  records  and  financial  and  inventory  records. 
Real-time  services  such  as  air  travel  are  dependent  upon  computers.    There  have  also  been 
successful  applications  of  automation  processes  for  automated  bank  tellers,  garbage 
collection,  automobile  diagnosis,  automated  warehouses,  vending  machines,  direct  distance 
dialing  and  computer-assisted  instruction. 

Automatic image pattern recognition is one aspect of automation with particular
significance for improving productivity and improving quality in the Service Sector.

Image pattern recognition technology has been successfully used to automate fingerprint
identification, weather prediction, photographic interpretation, and molecular and
cellular pattern analysis.

Automation,  via  image  pattern  recognition,  has  the  potential  for  improving 
productivity  and  services  in  many  additional  areas  such  as: 

° automation of analyzing x-rays and cytology: applications in the health field
° space applications and resource discovery and management
° safety systems for public transportation systems
° automated mail and parcel post recognition and handling systems in the post office
° automated maintenance and repair systems for consumer services
° automated security systems
° inspection and quality control for manufacturing industries

However, the barriers to the successful diffusion of automation to these and other
service  and  manufacturing  applications  are  many.    Research  and  development  in  automation 
technologies  are  fragmented  and  the  lack  of  standards  hampers  the  transferability  of 
existing  technology. 

Very  often,  applications  are  developed  using  unique  data  bases,  making  transfer  of 
existing  techniques  to  new  applications  difficult  or  impossible. 

The  development  and  diffusion  of  image  pattern  recognition  technology  requires  a 
coherent  technical  foundation  or  infrastructure  that  allows  people  to  communicate  with 
others  and  to  efficiently  utilize  available  technology.    That  is  why  this  Workshop  is  so 
important.    We  are  hopeful  that  additional  attention  to  standards  for  image  pattern 
recognition  will  advance  research  and  development  in  this  field  by  improving  communication 
and  data  interchange.    This,  in  turn,  will  help  government  agencies  and  private  sector 
organizations  to  improve  productivity  and  service  quality  by  applying  this  technology  more 
rapidly  and  more  efficiently. 

I  wish  you  success  in  your  Workshop.    I  hope  that  you  enjoy  this  visit  to  the 
National  Bureau  of  Standards  and  that  you  will  come  back  again. 

Discussion

The general consensus of the attendees supported de facto adoption of a limited number
(1-5) of tape format standards. This was evident both in the general discussion following
the presentations and in the results of a survey questionnaire. There was, however,
considerably less unanimity as to the detailed design of such formats.

One major area of disagreement centered around the issue of whether it is practical
to have one standard tape format or not. Those favoring the adoption of a single standard
(one third of the questionnaires returned) generally favored choosing a self-documenting
meta-standard with sufficient flexibility for a broad variety of applications. A majority
of this group (according to the questionnaires) ranked the NATO standard most highly and
felt that such a standard could be adopted within their own organization in one year or
less. Those favoring more than one standard generally anticipated greater time delays
(2-3 years) before a standard could be adopted within their own organization.

A key issue within this group is the desire for a format which can be read and written
by ANSI FORTRAN programs. This is important to users who lack systems programming
experience or support and tends to place several restrictions on format design as follows:

1) There is great resistance to incorporation of the self-documenting
header records needed for a meta-format in the same file as the
image. These users prefer the use of a header file (which
generally reduces the data integrity of the meta-format approach)
so that available system functions may be used to skip it - as
opposed to the user program functions needed to by-pass similar
header records.

2)  Single  file  (image  sans  header)  tapes  are  preferred  by  some  in 
this  group  due  to  system/programmer  difficulties  with  multi-file 
tapes. 

3) Fixed record length is required to ensure that tape blocking
can be controlled at a JCL level on all operating systems. In
addition, the length should be small (2K-4K bytes) to allow
buffering on minicomputer systems and should probably be an
integral multiple of some small number (128-256) to allow use
of long records via format buffered I/O using system logical/
physical record blocking control. Use of a fixed record
length on all files in the same tape is desirable, but not
necessary.

4) This group shows strong preference for simple formats which
could be used for processing within an installation as well
as for transmittal between installations. Thus use of any
line by line ancillary data (line numbers, non-video calibration
levels, etc.) is disliked. This group generally
favors the use of formatted (alphanumeric coded) or binary
word length data over other schemes for coding the picture element
values - though differences in machine codes and word lengths
make these approaches difficult as well. The hope here is
to use I, E, F, or A formats.

With all these factors it was generally concluded that it might be possible to
generate one or at most two candidate formats which could then be published for critical
review and which would form the basis for any further standardization effort. To this end
interested parties representing the entire spectrum of opinions reconvened about one week
later to attempt to formulate the candidate formats. The simple format was designed first
and may be described as follows:

The  Simple  Format 

1. No headers or other ancillary information on the tape. All such data is transmitted
as separate written documentation which is sent with the tape.

2. Each file contains one picture. Each picture is a regular two dimensional array of
single valued integers (i.e. only 1 spectral band per picture).

3.  Each  image  line  is  one  record  and  all  records  in  a  given  file  are  the  same  length. 
The maximum record length is 4096 bytes (1 byte = 8 bits).

4. Each picture element (pixel) is encoded as a positive integer in one byte. Picture
elements within a line are stored sequentially within a record.

5. Multi-file (multi-picture) tapes are permitted. Record lengths may vary between two
files on the same tape. Files (pictures) are separated by one EOF mark. The last file
is followed by two or more EOF marks.
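
As an illustration only, the five rules of the simple format can be sketched in a few lines of Python. The tape is modeled here as an in-memory list of records, with `None` standing in for an EOF mark. The function names, the list model, and the reading of "positive integer in one byte" as the range 0 to 255 are assumptions of this sketch and are not part of the workshop's description.

```python
from typing import List, Optional, Sequence

MAX_RECORD_BYTES = 4096  # maximum record length given in rule 3
EOF = None               # assumed stand-in for a tape EOF mark


def encode_picture(rows: Sequence[Sequence[int]]) -> List[bytes]:
    """Encode one picture as records: one record per image line (rule 3)."""
    if not rows:
        raise ValueError("a picture needs at least one line")
    width = len(rows[0])
    if not 1 <= width <= MAX_RECORD_BYTES:
        raise ValueError("line length must be between 1 and 4096 pixels")
    records = []
    for row in rows:
        if len(row) != width:
            raise ValueError("all records in a file must be the same length")
        # One byte per pixel (rule 4); bytes() rejects values outside 0..255.
        records.append(bytes(row))
    return records


def write_tape(pictures: Sequence[Sequence[Sequence[int]]]) -> List[Optional[bytes]]:
    """Lay out pictures as files separated by one EOF, ending with two (rule 5)."""
    tape: List[Optional[bytes]] = []
    for rows in pictures:
        tape.extend(encode_picture(rows))
        tape.append(EOF)
    tape.append(EOF)  # second EOF mark after the last file
    return tape


def read_tape(tape: Sequence[Optional[bytes]]) -> List[List[List[int]]]:
    """Recover the pictures; an EOF directly after an EOF ends the tape."""
    pictures, current = [], []
    for item in tape:
        if item is EOF:
            if not current:
                break
            pictures.append([list(record) for record in current])
            current = []
        else:
            current.append(item)
    return pictures
```

Because no header is written (rule 1), the reader recovers only the pixel values and the line length of each file; everything else must come from the written documentation sent with the tape.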

There was much discussion on the topic of a meta-format and an attempt to design one.
A major conceptual difficulty raised was the desirability of having yet another proposed
meta-format, especially one proposed by a group not actively involved in generating or
exchanging large data bases. It was generally concluded that such an attempt might not be
worthwhile, especially in light of continuing internationally coordinated activities by
NATO and NASA, unless it resulted in a format near enough to one or more of the existing
ones to cause an acceptable design impact. Candidate meta-formats include the NATO format,
the NASA VICAR format, and the NASA proposed new CCT format (see SOS paper). Each has
its strong and weak points.

The NATO format has been internationally coordinated and is the only one that makes
provision for transmittal of non-imagery data, real and complex valued imagery, and negative
value image points. It is a very good example of a meta-format, especially in light of
the fact that image format and tape format are completely decoupled and treated as separate
issues. This allows great flexibility including very long image lines with simultaneous
limits on maximum tape buffer length. However, this format places limits on ancillary
data (none in an image line and only 4K max. characters in free format per image header)
which severely limit its utility in NASA type data bases.

The NASA VICAR format is currently the only one in operational use and allows
unlimited amounts of ancillary data per image. However, it has only been used to date
as a processing and storage (not transmittal) mechanism and is currently quite machine
dependent in pixel storage codes. Further, it is not designed for use with non-image data
(feature vectors, etc.) and retains only limited flexibility for distinctions between tape
and image formats.

The proposed new NASA CCT format can be expected to be very heavily used and widely
dispersed owing to large scale distribution of the tapes by NASA. It is currently the
least settled (hence easiest to impact) of the three. As currently proposed it would
allow maximal use of ancillary data, but would only contain provision for transmittal of
positive integer, imagery type data. A relatively inflexible relationship between tape
and image format is proposed which would make use of smaller tape buffers more difficult.
Furthermore, most applications do not require such extensive ancillary data (line by line)
and many users are quite concerned about finding simple ways to eliminate it from the tapes
they use.

After lengthy discussion centering around a variant of the VICAR format it was
generally concluded that design of a separate new meta-format was not reasonable in light
of the uncertainties of acceptance by the current major users of such formats. As a
result, no meta-format was drafted. However, it is hoped that the comments about existing
meta-formats may help to guide their further developments.

The NATO RSG-4/SGIP TAPE FORMAT
By 

John S. Dehne
US  Army  Night  Vision  Laboratory 
Acting  US  Project  Officer,  NATO  RSG-4/SGIP 


Under the NATO Defense Research Group (DRG), cognizance of and cooperation in the
field of Pattern Recognition is covered by AC/243 (Panel III) Research Study Group 4.
In  recent  years,  this  group  (RSG-4)  has  been  very  active  in  assessing  various  military 
application  areas  for  pattern  recognition  and  in  fostering  cooperation  and  coordination 
among  participating  governments  in  each  of  these  areas.     Dr.  David  Hodge  of  the  US  Army 
Human  Engineering  Laboratory  serves  as  the  US  Delegate  to  RSG-4. 

The  first  area  which  RSG-4  assessed  was  Image  Processing.     A  substantial  interest 
for  cooperative  efforts  in  this  area  was  discovered  among  all  participants.     As  a  result 
a  Subgroup  on  Image  Processing  (SGIP)  has  been  formed  to  plan  and  coordinate  cooperative 
projects.     This  group  consists  of  a  Project  Officer  from  each  participating  country. 
Mr.  John  S.  Dehne  of  the  US  Army  Night  Vision  Laboratory  is  currently  the  Acting  US 
Project  Officer. 

It was immediately clear to all participants that cooperative efforts would depend
on ready interchange of image data bases and algorithms. This interchange was hampered
by the fact that each installation had generated its own image processing software system
specifically for its own computer facilities, which differed enormously. Thus, each
installation had developed one or more unique tape formats and based a large software
and data base investment on the use of that format. Because of this, adoption of a
single format for use in all installations would have been quite expensive. Instead,
it was decided to develop a tape format to be used only for transferring digital imagery
from one installation to another. Each installation need write only two simple programs
to begin exchanging imagery - one to translate from the transfer format to the format
of that specific installation and another to do the reverse.
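
The economy of this arrangement can be checked with a short calculation. The sketch below is an illustration and not part of the NATO work: it compares the two programs per installation described above with a hypothetical alternative in which every installation writes a separate translator for each other installation's format. The function name is an assumption of this sketch.

```python
from typing import Tuple


def translators_needed(n_installations: int) -> Tuple[int, int]:
    """Return (programs via a transfer format, programs for direct pairwise exchange)."""
    if n_installations < 0:
        raise ValueError("number of installations cannot be negative")
    # Two programs per installation: into and out of the transfer format.
    via_transfer = 2 * n_installations
    # Hypothetical alternative: one translator per ordered pair of installations.
    pairwise = n_installations * (n_installations - 1)
    return via_transfer, pairwise
```

For three installations the two counts are equal (6 and 6); for ten installations the transfer format needs 20 programs against 90 for direct pairwise translation.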

The choice of this approach, development of a transfer format, had two other very nice
aspects.     First,  it  relaxes  some  of  the  constraints  which  must  normally  be  considered  in 
designing  a  tape  format  for  imagery.     For  instance,  since  the  format  is  to  be  designed  for 
transferral  of  image  data  rather  than  storage,  constraints  on  packing  density  are  somewhat 
relieved.     Similarly,  since  the  format  does  not  have  to  be  used  for  the  actual  processing 
of  the  images,  constraints  relating  to  the  use  of  headers,  trailers,  and  attempts  to  design 
for  minimal  tape  motion  and  processing  time  are  also  relaxed. 

Second,  development  of  a  format  just  for  transferral  of  image  data  tends  to  focus 
attention  on  the  real  problem  -  transfer  of  all  image  related  data.     In  this  light  it 
becomes  important  to  consider  the  transferral  of  other  imagery  related  data  when  designing 
the  format.     Such  data  include  image  transforms,  calibrated  imagery  (which  may  be  composed 
of  non- integer  values),  feature  vectors,  classification  parameters,  documentation  of  the 
recording  environment,  and  even  the  image  processing  source  programs. 

It  was  with  these  considerations  in  mind  that  the  RSG-4/SGIP  developed  a  draft  format 
at its first meeting in July 1975. The NATO group was also made aware of similar efforts
then ongoing in the US by the EIA and an IEEE panel on Biomedical Pattern Recognition.
The  draft  format  was  circulated  to  all  contributing  defense  laboratories  in  participating 
countries,  and  was  made  available  to  the  EIA  and  IEEE  group  in  the  US  for  comment.  This 
resulted  in  the  final  approval  of  the  format  as  shown  below  at  the  second  RSG-4/SGIP 
meeting  in  February  1976.     Experimental  testing  of  translation  programs  is  scheduled  to 
begin shortly with the exchange of a test tape able to flex all options (actually two
tapes - one 7 track and one 9 track). This tape will be circulated among participating
countries  before  the  next  meeting  now  scheduled  for  November  1976. 

Several things should be noted about the format. First, it is generally entirely
self-documenting. This is achieved by the use of two headers at the start of each file.
Header 1 documents the origin, structure and data organization of the file in a fixed
format. Header 2 allows free format documentation of all other parameters pertinent to
the data (e.g. documentation of recording environment, location of significant image
features, recording of image processing techniques already applied, etc.). This is
achieved without the need for extra EOF marks, which usually cause a problem in such cases,
by making each header a record rather than a separate file. Thus, header processing may
be ignored, if the file structure is otherwise known, merely by skipping the first two
records of each file.

Second, the format is unusual in that it separates the format of the data as recorded
on the tape from the format of the data when considered to be an image. Thus, while each
image (which may be multi-spectral) is written on the tape as a separate file (there may
be one or more on a tape), it is not necessarily true that each record contains one image
line (though this is a particular form). This not only allows the format to be used for
transferral of data other than images, but also allows very long image lines (e.g. 12
channel, pixel interleaved, 4k by 4k pictures) to be transferred without requiring overly
large tape buffers (4k max. allowed). This makes the format applicable to small computer
owners as well as those with large installations.

The  basis  for  the  entire  tape  format  is  the  tape  character  or  byte.     This  is  taken 
to  be  8  bits  on  a  9  track  tape  and  6  bits  on  a  7  track  tape.     Alphanumeric  data  is  encoded 
in  BCD  on  7  track  tapes  and  in  a  truncated  version  of  ASCII  on  9  track  tapes  (one  tape 
character or byte in either case). This results in a format specification for both 7 and
9  track  tapes  which  can  be  read  on  a  very  wide  range  of  machines  (French  MITRA,  IBM,  CDC, 
DEC  to  name  a  few)  and  which  requires  actual  translation  (not  just  repacking)  to  convert 
between  7  and  9  track  tapes. 

Header 1 may be broken into three principal parts based on what is being described.
Bytes 1-24 concern the origin and identity of the image. This includes bytes 17-24, an
8 byte alphanumeric identifier given uniquely to this particular image file by its
originator. Bytes 25-80 describe the format of the rest of the data in the file.
Bytes 81-104 detail the format of the data in the image (x and y sizes and type of data).
Bytes 105-128 are currently unassigned and remain for future modification and expansion
(some could be used to specify multi-image files, for instance, though this is not
currently under consideration).

Header 2 is in free format and constrained to non-zero length to encourage good
documentation. Exact contents of Header 2 will depend on the important factors of the
problem at hand.

The basis of the format for the actual image data is the integer format. Integers
are recorded in ones' complement binary in consecutive tape bytes, most significant byte
first on tape. Real numbers are recorded as separate integers for mantissa and exponent
(both to base 2 with no bias), where the length (in tape bytes) may be different for
mantissa and exponent. The exponent is recorded first. Complex numbers (which may be
recorded as integers or reals) and multi-channel pixel interleaved data are
straightforward expansions of the basic format.

The  resultant  format  can  be  generated  or  read  using  CDC  6600  utility  programs, 
and US standard FORTRAN subprograms are being written by the author to effect the
translations in an almost machine independent fashion. The only machine dependent
portions will be one routine to read or write the long tape records required (impossible
to do with A formatted or unformatted reads and writes on many machines) and one routine
to create the ones' complement format for numbers from the internal format of the machine
being  used.     Both  these  routines  may  require  assembly  programming  depending  on  the 
installation.     In  addition,  positioning  to  the  correct  file  will  remain  installation 
dependent  and  it  is  recognized  that  some  installations  may  only  be  able  to  handle  single 
file  tapes. 

Distribution of tapes within the NATO RSG-4/SGIP group will be made on a bilateral
decentralized  basis  as  one  participating  installation  requests  data  from  another  directly 
or  through  the  national  Project  Officers.     The  Project  Officers  will  however  make  surveys 
of  the  imagery  data  available  for  transferral  from  their  countries.     This  information  will 
be  available  to  all  participating  installations,  in  all  countries.     The  effectiveness  of 
this  distribution  method  remains  to  be  tested  and  may  depend  on  the  specific  projects 
undertaken  and  the  schemes  used  to  manage  them  -  which  are  not  yet  determined.     In  any 
case, it may be the only viable scheme within the egalitarian framework of NATO.

NATO (RSG-4) Tape Format

General:    Data is recorded on 1/2 inch wide magnetic computer tape.
            Format allows either 7 or 9 tracks at 800 BPI, NRZI.
            Maximum record length allowed is 4k tape characters (6 or 8 bits).
            A tape contains one or more files.
            Files are separated by single EOF marks.
            Last file on tape is followed by at least two EOF marks.
            Each file contains a whole image. Images too large for one tape must be
            divided into sub-images which are then handled as individual images.
            Each file consists of two header records followed by N image data records.

Header 1:   The 1st record of any file (Header 1) is 128 bytes long.
            Header 1 is coded in BCD for 7 track tapes, even parity, and in ASCII
            (modified) for 9 track tapes, odd parity.
            Allowed characters are shown below:

                          Equivalent Octal Number
            Symbol        7 tracks    9 tracks

            0             12          060
            1             01          061
            2             02          062
            3             03          063
            4             04          064
            5             05          065
            6             06          066
            7             07          067
            8             10          070
            9             11          071
            A             61          101
            B             62          102
            C             63          103
            D             64          104
            E             65          105
            F             66          106
            G             67          107
            H             70          110
            I             71          111
            J             41          112
            K             42          113
            L             43          114
            M             44          115
            N             45          116
            O             46          117
            P             47          120
            Q             50          121
            R             51          122
            S             22          123
            T             23          124
            U             24          125
            V             25          126
            W             26          127
            X             27          130
            Y             30          131
            Z             31          132
            =             13          075
            .             73          056
            /             21          057
            (             34          050
            )             74          051
            $             53          044
            *             54          052
            ,             33          054
            '             14          047
            -             40          055
            +             60          053
            b (blank)     20          040
            new line      57          015

            A symbol is recorded in one byte on the tape.
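
The translation between the two tape codes that the table implies can be sketched as a pair of lookup tables. This is only an illustration in Python, not the FORTRAN subprograms mentioned in the text; the names `BCD_7TRACK`, `ASCII_9TRACK`, `encode` and `translate` are hypothetical, and the punctuation symbols are identified from the 9 track column, which coincides with standard ASCII codes for every entry except "new line".

```python
# Symbol -> 7 track BCD code (octal), copied from the table of allowed characters.
BCD_7TRACK = {"0": 0o12}
BCD_7TRACK.update({str(d): d for d in range(1, 10)})                 # 1-9 -> 01-11
BCD_7TRACK.update({c: 0o61 + i for i, c in enumerate("ABCDEFGHI")})   # 61-71
BCD_7TRACK.update({c: 0o41 + i for i, c in enumerate("JKLMNOPQR")})   # 41-51
BCD_7TRACK.update({c: 0o22 + i for i, c in enumerate("STUVWXYZ")})    # 22-31
BCD_7TRACK.update({
    "=": 0o13, ".": 0o73, "/": 0o21, "(": 0o34, ")": 0o74, "$": 0o53,
    "*": 0o54, ",": 0o33, "'": 0o14, "-": 0o40, "+": 0o60, " ": 0o20,
    "\n": 0o57,  # the "new line" entry of the table
})

# Symbol -> 9 track code. Each table entry equals the ASCII code of the symbol,
# except "new line", which the table lists as octal 015.
ASCII_9TRACK = {c: (0o15 if c == "\n" else ord(c)) for c in BCD_7TRACK}


def _table(tracks):
    """Return the code table for a 7 or 9 track tape."""
    if tracks not in (7, 9):
        raise ValueError("tracks must be 7 or 9")
    return BCD_7TRACK if tracks == 7 else ASCII_9TRACK


def encode(text, tracks):
    """Encode header text as tape characters, one symbol per character."""
    table = _table(tracks)
    return bytes(table[ch] for ch in text)


def translate(data, from_tracks, to_tracks):
    """Translate header characters from one tape code to the other."""
    source, target = _table(from_tracks), _table(to_tracks)
    symbol_of = {code: ch for ch, code in source.items()}
    return bytes(target[symbol_of[code]] for code in data)
```

For instance, `translate(encode("USA NVL ", 7), 7, 9)` yields the same characters as `encode("USA NVL ", 9)`. A symbol outside the table raises `KeyError`, which seems a reasonable way to flag text that the format does not allow.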
Header 1 contains the following information:

Bytes       Meaning

1-8         Identity of Originator: 4 symbols for country, 4 symbols for the laboratory.
            Example: USA NVL
9-16        Date of Data: Year Month Day
            Example: 76 06 03
17-24       Name/Number of File/Image
            Example: GRAF PG1
25-32       Number of Records/File (R)
            (R = N+2 to account for 2 header records)
33-40       Number of tape characters in header 2 (L2) (L2 > 0 required)
41-48       Number of tape characters/image data record (L3)
49-56       Number of channels/sample (C)
            (One channel complex data is treated as two channel data with
            C = 2 x no. of channels)
57-64       Number of tape characters/channel for integer data (I)
            (I = 0 for real values)
65-72       Number of tape characters for mantissa of a real valued channel (E_m)
            (E_m = 0 for integer values)
73-80       Number of tape characters for exponent of a real valued channel (E_e)
            (E_e = 0 for integer values)
81-88       Number of samples per line (S)
89-96       Number of lines per image (LI)
97-104      Type of data values (T):
            T = 0 non-complex, integer
            T = 1 complex, integer
            T = 2 non-complex, real
            T = 3 complex, real
105-128     Currently not assigned

Header 1 example:  An image of 1024 x 1024 sample points has 32 different grey
                   levels for each channel (one 6-bit tape character).
                   The image is registered in 3 colors. Header 2 is 128 bytes
                   long. Image generated today.

Bytes       Contents      Alternate contents
                          (b denotes a blank)

1-8         USA NVL       USA NVL
9-16        76 06 03      76 06 03
17-24       EXAMPLE1      EXAMPLE2
25-32       bbbbb770      00001026
33-40       bbbbb128      00000128
41-48       bbbb4096      00003072
49-56       bbbbbbb3      00000003
57-64       bbbbbbb1      00000001
65-72       bbbbbbb0      00000000
73-80       bbbbbbb0      00000000
81-88       bbbb1024      00001024
89-96       bbbb1024      00001024
97-104      bbbbbbb0      00000000
105-128     Blanks or zeros

Header 2:   The 2nd record of any file contains a free form description of the image.
            This may include specification of the physical significance of the
            various channels, description of scanner, meteorological conditions
            when image taken, location of objects of interest in image, etc.
            Header 2 has variable length as specified in Header 1 (L2)
            (0 < L2 <= 4096 tape characters).
            Coding is same as Header 1.
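
As an illustration of how a reader might use Header 1, the following Python sketch slices the 128 characters into the fields listed above and checks that the record count R agrees with the image size. It is a sketch under stated assumptions: the header is taken to be already translated into ordinary text, every image data record except possibly the last is assumed to be filled completely, and the names `FIELDS`, `parse_header1`, `expected_data_records` and `make_header1` (and the field labels `Em` and `Ee`) are hypothetical.

```python
# (name, first byte, last byte), 1-based and inclusive, as in the Header 1 table.
FIELDS = [
    ("originator", 1, 8), ("date", 9, 16), ("name", 17, 24),
    ("R", 25, 32), ("L2", 33, 40), ("L3", 41, 48), ("C", 49, 56),
    ("I", 57, 64), ("Em", 65, 72), ("Ee", 73, 80),
    ("S", 81, 88), ("LI", 89, 96), ("T", 97, 104),
]
TEXT_FIELDS = {"originator", "date", "name"}


def parse_header1(text):
    """Split the 128-character Header 1 into a dictionary of fields."""
    if len(text) != 128:
        raise ValueError("Header 1 must be 128 characters long")
    header = {}
    for name, first, last in FIELDS:
        raw = text[first - 1:last]
        # int() accepts both blank-padded and zero-padded numbers.
        header[name] = raw if name in TEXT_FIELDS else int(raw)
    return header


def expected_data_records(h):
    """Number of image data records N implied by the image dimensions."""
    chars_per_value = h["I"] if h["T"] in (0, 1) else h["Em"] + h["Ee"]
    total = h["S"] * h["LI"] * h["C"] * chars_per_value
    return -(-total // h["L3"])  # ceiling division


def make_header1(originator, date, name, numbers, pad=" "):
    """Build a Header 1 string; numbers are R, L2, L3, C, I, Em, Ee, S, LI, T."""
    text = f"{originator:<8}{date:<8}{name:<8}"
    text += "".join(str(n).rjust(8, pad) for n in numbers)
    return text.ljust(128)


example1 = make_header1("USA NVL", "76 06 03", "EXAMPLE1",
                        [770, 128, 4096, 3, 1, 0, 0, 1024, 1024, 0])
example2 = make_header1("USA NVL", "76 06 03", "EXAMPLE2",
                        [1026, 128, 3072, 3, 1, 0, 0, 1024, 1024, 0], pad="0")
```

Both columns of the Header 1 example are consistent under these assumptions: 1024 x 1024 samples of 3 one-character channels occupy 768 records of 4096 characters (R = 770), or 1024 records of 3072 characters (R = 1026, one image line per record).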
Image Data:      Image data is coded in binary, odd parity, ones' complement.
                 Integer format is basic to all others.

Integer Format:  Most significant part is recorded as 1st tape character.
                 Most significant bit of 1st tape character is sign bit
                 (0 = positive, 1 = negative).
                 Most significant bit is the left most bit.
                 Example: For 2 tape characters/channel, 7 track tape:
                      5  ->  000000000101
                     -5  ->  111111111010
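
The integer format can be illustrated with a short Python sketch that converts between a signed integer and its tape characters. The function names are hypothetical, and a tape character is represented simply as a small non-negative integer (6 bits for 7 track tape, 8 bits for 9 track tape).

```python
def encode_integer(value, n_chars, bits_per_char):
    """Return the ones' complement tape characters, most significant first."""
    width = n_chars * bits_per_char
    limit = (1 << (width - 1)) - 1          # largest magnitude that fits
    if abs(value) > limit:
        raise ValueError("value does not fit in the given number of characters")
    mask = (1 << width) - 1
    # A negative number is the bitwise complement of its magnitude.
    pattern = value if value >= 0 else ~(-value) & mask
    char_mask = (1 << bits_per_char) - 1
    return [(pattern >> (bits_per_char * i)) & char_mask
            for i in reversed(range(n_chars))]


def decode_integer(chars, bits_per_char):
    """Inverse of encode_integer."""
    width = len(chars) * bits_per_char
    pattern = 0
    for ch in chars:
        pattern = (pattern << bits_per_char) | ch
    if pattern >> (width - 1):               # sign bit set: negative
        return -(~pattern & ((1 << width) - 1))
    return pattern


def bit_string(chars, bits_per_char):
    """Show the characters as one string of bits, left most bit first."""
    return "".join(format(ch, f"0{bits_per_char}b") for ch in chars)
```

With 2 tape characters per channel on 7 track tape, `bit_string(encode_integer(5, 2, 6), 6)` gives `000000000101` and `bit_string(encode_integer(-5, 2, 6), 6)` gives `111111111010`, matching the example. Note that ones' complement has two representations of zero; this sketch decodes the all-ones pattern as 0.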

Real Format:     The exponent and mantissa (base 2) are reduced to integers. Both
                 are then stored as separate integers on the tape for each sample.
                 First E_e tape characters for the exponent, then E_m tape characters
                 for the mantissa.
                 Example: 14.5 (base 10) = 01110.1 (base 2) x 2^0 = 011101 (base 2) x 2^-1
                          IEXP = 111110    IMANT = 011101    where E_e = E_m = 1
                          and 7 track tape is used.
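
A possible reading of the real format is sketched below in Python: the value is the integer mantissa times 2 raised to the integer exponent, each stored as a ones' complement integer, exponent first. The way a value is reduced to a mantissa and exponent here (stripping trailing zero bits of the mantissa) is an assumption, since the text does not prescribe a normalization, and all function names are hypothetical.

```python
import math


def _to_chars(value, n_chars, bits):
    """Ones' complement integer as tape characters, most significant first."""
    width = n_chars * bits
    if abs(value) > (1 << (width - 1)) - 1:
        raise ValueError("value does not fit in the given number of characters")
    pattern = value if value >= 0 else ~(-value) & ((1 << width) - 1)
    return [(pattern >> (bits * i)) & ((1 << bits) - 1)
            for i in reversed(range(n_chars))]


def _from_chars(chars, bits):
    """Inverse of _to_chars."""
    width = len(chars) * bits
    pattern = 0
    for ch in chars:
        pattern = (pattern << bits) | ch
    if pattern >> (width - 1):
        return -(~pattern & ((1 << width) - 1))
    return pattern


def encode_real(x, n_exp, n_mant, bits):
    """Exponent characters first, then mantissa characters."""
    mantissa, denominator = float(x).as_integer_ratio()
    exponent = -(denominator.bit_length() - 1)   # denominator is a power of two
    while mantissa != 0 and mantissa % 2 == 0:   # assumed normalization
        mantissa //= 2
        exponent += 1
    return _to_chars(exponent, n_exp, bits) + _to_chars(mantissa, n_mant, bits)


def decode_real(chars, n_exp, n_mant, bits):
    """Rebuild the value mantissa * 2**exponent from the tape characters."""
    exponent = _from_chars(chars[:n_exp], bits)
    mantissa = _from_chars(chars[n_exp:n_exp + n_mant], bits)
    return math.ldexp(mantissa, exponent)
```

For the example in the text, `encode_real(14.5, 1, 1, 6)` gives the characters 111110 (exponent -1) and 011101 (mantissa 29), and decoding them returns 14.5. Values whose mantissa needs more bits than the header allows raise an error rather than being rounded.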
Complex Format:  Integer values (T=1) stored as integers. Real values (T=3) stored
                 as real values. Each channel consists of two values. The
                 first value is the real part, the second is the imaginary part.
                 Example:

                 | Real | Imag | Real | Imag | Real | Imag |
                 |    Ch 1     |    Ch 2     |    Ch 3     |

Multi-Channel Data:  Non-registered data stored as separate images in separate
                     files.
                     Registered data stored on pixel interleaved basis.
                     (For one sample the values of the different channels are
                     recorded adjacently.)
                     Examples:  3 color, non-complex data

                     | Ch1 | Ch2 | Ch3 | Ch1 | Ch2 | Ch3 | Ch1 | ...
                     |    Sample N     |   Sample N+1    |

                                2 channel, complex data

                     |     Ch1     |     Ch2     |     Ch1     |     Ch2     |
                     | Real | Imag | Real | Imag | Real | Imag | Real | Imag |
                     |         Sample N          |        Sample N+1         |
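
The pixel interleaved ordering, and the fact that tape records need not coincide with image lines, can be sketched in Python as follows. The function names are hypothetical, and each value is treated as a single item (as it would be with one tape character per channel).

```python
def interleave(channels):
    """Pixel interleave equally long channels: all channels of sample 0,
    then all channels of sample 1, and so on."""
    return [value for sample in zip(*channels) for value in sample]


def complex_as_channels(channels):
    """Treat each complex channel as two channels (real part, then imaginary
    part), so that C in Header 1 is twice the number of complex channels."""
    split = []
    for channel in channels:
        split.append([z.real for z in channel])
        split.append([z.imag for z in channel])
    return split


def split_records(values, record_length):
    """Cut the interleaved stream into records of at most record_length items,
    independently of where image lines begin or end."""
    return [values[i:i + record_length]
            for i in range(0, len(values), record_length)]
```

For three colour channels `[1, 2]`, `[10, 20]` and `[100, 200]`, `interleave` gives `[1, 10, 100, 2, 20, 200]`. Cutting that stream with `split_records` is what lets a very long image line span several records of at most 4k tape characters.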

Image Tape Formats Questionnaire

1.  Please specify your organizational affiliation:

    ___ Academia   ___ Industrial   ___ Governmental   ___ Other (Specify) ________

2.  Please characterize your image processing interest(s):

    ___ Biomedical   ___ ERTS, LANDSAT   ___ Real Time Sensors (FLIR, TV, etc.)

    ___ OCR   ___ Reconnaissance   ___ Fingerprint, Handprint

    ___ Other (Specify) ________

3.  Do you think it is possible to have one standard format for all applications?
    ___ Yes   ___ No

    How desirable is it to have only one standard?

    The only good way                    The worst possible way
            5          4          3          2          1

4.  Do you favor a limited number of standard formats for different application areas?
    ___ Yes   ___ No

    If so, how many do you think would be required? ________

5.  Would your organization be willing to use a de facto standard?   ___ Yes   ___ No

    If so, how long would adoption of such a standard take?

    ___ 1 mo. or less   ___ 1 yr or less   ___ 2-3 yrs   ___ 5 yrs or more

6.  Is your organization an image data consumer or producer?
    ___ Consumer   ___ Producer

    What sort of image data might your organization furnish?


    What sort of image data would your organization desire to receive?


7.  What is the best method for distributing image data tapes?

    ___ One or a few central distribution points

    ___ Exchange between individual institutions

    ___ Both of the above

    ___ Other (Specify) ________

8.  An acceptable format must include the use of:

    ___ 9 track tapes

    ___ 7 track tapes

    ___ DECTAPES

    ___ Other (Specify) ________

9.  Comment on the strong and weak points of the formats presented and indicate your
    preference:

    a.  NATO format (Dehne)             ___ Best   ___ Least Useful

    b.  Large Image Format (Sos)        ___ Best   ___ Least Useful

    c.  User Format (Hawley)            ___ Best   ___ Least Useful

    d.  Biomedical Format (Pavlidis)    ___ Best   ___ Least Useful

10. Additional Comments (Use back of sheet if necessary):

Tape Formats
by 

Theo  Pavlidis 

Dept.  of  Electrical  Engineering  and  Computer  Sciences 
Princeton  University 

Last  year  a  task  force  on  Data  Bases  and  Portable  Software  was  formed  within  the 
Biomedical  Pattern  Recognition  Subcommittee  of  the  Machine  Intelligence  and  Pattern 
Analysis  Committee  of  the  IEEE  Computer  Society.    The  problem  of  tape  format  came  up 
quite  early  in  our  deliberations  and  after  considerable  discussion  we  felt  that  it  was 
best  to  avoid  specifying  the  format  in  too  restrictive  a  fashion  and  instead  rely  on 
good  documentation.    The  following  general  specifications  were  adopted. 

Tape  and  Data  Format 

(1)  The data should be stored on 1/2 inch magnetic tape, NRZI mode, odd parity,
IBM  compatible,  7  or  9  tracks,  with  9  tracks  preferable. 

(2)  Suggested  density  800  bpi  with  556  bpi  and  1600  bpi  also  acceptable. 

(3)  Block  size  (physical  record  length)  should  not  exceed  4K  (4096  8-bit  bytes). 
(This  requirement  was  imposed  in  order  to  allow  the  reading  of  tapes  by 
minicomputers  with  limited  buffer  size). 

(4)  First file on the tape should specify in EBCDIC or BCD format the contents
of  the  tape. 

(5)  Each  item  must  be  preceded  by  identification,  preferably  in  separate 
record. 

(6)  For  gray  scale  pictures  of  size  greater  or  equal  to  64  x  64,  each  pixel 
should  be  a  byte,  each  line  a  physical  record,  each  picture  a  file. 

(7)  EBCDIC or BCD formats are preferable to binary.

The above specifications are quite loose and can hardly qualify as a standard.
However, there may be situations where even they are too strict. For example,
there are applications where one deals with very large binary pictures, 512 x 512 or
even greater. Printed wiring boards are one such instance. Storing one pixel per byte
would require 8 times as much space as storing the maximum 8 pixels per byte. If an
unpacking program is available, the information can be retrieved easily. Such a program
should also be part of the processing program, since storing 512 x 512 bytes in core can be
quite a problem for most installations.
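
The packing and unpacking mentioned above can be sketched in a few lines of Python. This is an illustration only: the bit order within a byte (first pixel in the most significant bit) and the zero padding of a final partial byte are assumptions, and the function names are hypothetical.

```python
def pack_bits(pixels):
    """Pack binary pixels (0 or 1) eight to a byte, first pixel in the
    most significant bit; a final partial byte is padded with zeros."""
    packed = bytearray()
    for start in range(0, len(pixels), 8):
        group = pixels[start:start + 8]
        byte = 0
        for bit in group:
            byte = (byte << 1) | (bit & 1)
        byte <<= 8 - len(group)
        packed.append(byte)
    return bytes(packed)


def unpack_bits(packed, count):
    """Recover the first `count` pixels from packed bytes."""
    pixels = [(byte >> (7 - k)) & 1 for byte in packed for k in range(8)]
    return pixels[:count]
```

A 512 x 512 binary picture packed this way occupies 32,768 bytes instead of 262,144. The pixel count has to be carried separately (for example in the documentation accompanying the tape), because the padding bits cannot be told apart from pixels.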

This  points  out  the  problems  which  must  be  dealt  with  in  establishing  standards. 
The  variety  of  picture  sizes  and  types  makes  a  universal  standard  impossible.    At  one 
end  we  may  have  data  bases  consisting  of  many  thousands  of  small  (e.g.  24  x  24)  binary 
pictures.    At  the  other  end  we  may  find  very  large  (e.g.  1024  x  1024)  grey  scale  pictures, 
like  radiographs  with  each  data  base  consisting  of  not  more  than  100  of  them.    The  one 
picture  per  file  specification  may  be  reasonable  in  the  second  case  but  not  in  the 
first.    Also  bit  packing  is  possible  only  in  the  first  case.    (IEEE  Data  Base  1.2.2  which 
contains  alphanumeric  data  is  of  this  nature  and  it  packs  3  bits  per  integer.    The  bit 
retrieval  is  quite  easy  using  a  high  level  language  like  FORTRAN). 

A  major  problem  in  reading  tapes  from  other  installations  is  the  handling  of  special 
symbols  entered  by  the  operating  system  when  a  tape  is  written  by  a  high  level  language. 
Binary  tapes  would  not  pose  this  problem  if  written  with  proper  care.    However  such  tapes 
would be very difficult to read when transferring data between machines having different
bit-byte configurations (e.g. IBM and CDC).

It  seems  that  we  may  need  more  than  one  standard  and  that  for  the  immediate  future 
the  best  hope  in  facilitating  transfer  of  data  bases  may  be  good  documentation.  The 
establishment  of  separate  standards  for  each  application    (e.g.  satellite  pictures, 
radiographs,  alphanumerics  etc.)  may  also  be  a  possibility.    Areas  where  a  standard 
has  not  been  established  could  use  one  from  their  "nearest  neighbor." 

Earth Observation Image Data Format


by 

John  Y.  Sos 

Head,  Image  Processing  Branch 
NASA  Goddard  Space  Flight  Center 
Greenbelt,  Maryland  20771 


Experience with processing and disseminating Landsat imagery within the NASA Data
Processing Facility (NDPF), located at the Goddard Space Flight Center (GSFC), has been
applied to the development of a flexible format for Computer Compatible Tape (CCT)
containing multispectral Earth observation sensor data. The driving functions which
comprise the data format requirements are summarized in Table 1. Using these drivers as
a guide, coupled with four years of experience in Landsat image processing, the following
general data format guidelines emerge:

•  Open  workspaces  must  exist  within  the  tape  format  so  that  all 
subsystems in a data processing chain could easily append,
delete or modify information

•  The  tape  format  should  be  organized  and  segmented  along  the 
following  logical  lines: 

1.  The  first  information  encountered  should  be  introductory 
and provide self identification standards in byte
convention

2.  Afterwards  a  generic  description  of  all  data  fields  and 
data  within  those  data  fields  should  be  provided  in 
standard  byte  conventions 

3.  Specific  information  should  be  provided  to  decode  and 
interpret  data  fields  and  data  within  those  data  fields 
in  byte  conventions 

4.  Ancillary  and  annotation  information  should  be  provided 
in  compact  form  to  accompany  the  actual  imagery 

5.  The  imagery  itself  should  be  presented  in  a  compact  form 
with  a  minimal  amount  of  unique  identification 

6.  The  blocking  of  data  should  be  optimized  around  the  units 
of  imagery  acquisition 

7.  A  summary  should  follow  to  conclude  a  major  image  data  set 

The implementation of the above guidelines is shown in Figures 1 through 3. Figure
1 illustrates the organization and structure of individual CCT records. Figure 2 shows
a high level layout of a multispectral CCT with detailed CCT features shown in Figure 3.
Examples of three possible arrangements (or interleaving) of Landsat MSS image data are
presented in Figure 4. More detailed information concerning these features is found in
the "Quick Look Processor CCT Data Format Specification" dated July 31, 1975, prepared
under contract NAS5-24033, Mod 26 for GSFC by Operations Research, Inc.

It is planned to introduce the proposed formats in 1978 as a new digital Image
Processing Facility (IPF) becomes operational at GSFC. As a first step in the systematic
approach to image data format standardization, a data format which meets the general
requirements will be introduced by NASA, the U.S. Department of Interior's EROS Data Center,

TABLE 1

EARTH OBSERVATION IMAGE
DATA FORMAT DRIVERS

•  MULTIPLE IMAGE SENSORS
   -  LANDSAT (MSS, RBV)*
   -  NIMBUS (CZCS, THIR)*
   -  SMS (VISSR)*

•  VARYING SENSOR CHARACTERISTICS
   -  NUMBER OF SPECTRAL BANDS (1-10)
   -  NUMBER OF PIXELS/LINE (1,000-20,000)
   -  NUMBER OF LINES/SCENE (1,000-20,000)

•  MULTIPLE USERS
   -  R&D
   -  PRODUCTION

•  VARIOUS USER PROCESSING REQUIREMENTS
   -  MULTISPECTRAL CLASSIFICATION
   -  BLACK & WHITE FILM PRODUCT GENERATION
   -  FALSE COLOR COMPOSITES
   -  SPATIAL FILTERING
   -  PATTERN RECOGNITION

•  DATA VOLUME
   -  HIGH IMAGE DATA VOLUME (10^)
   -  LOW ANCILLARY DATA OVERHEAD

•  MULTISPECTRAL IMAGE DATA FORMATS
   -  BAND INTERLEAVED BY LINE (BIL)
   -  BAND INTERLEAVED BY PIXEL (BIP)
   -  BAND SEQUENTIAL (BSQ)

*  MSS    -  MULTISPECTRAL SCANNER
   RBV    -  RETURN BEAM VIDICON
   CZCS   -  COASTAL ZONE COLOR SCANNER
   THIR   -  TEMPERATURE HUMIDITY INFRARED RADIOMETER
   VISSR  -  VISIBLE AND INFRARED SPIN-SCAN RADIOMETER

FIGURE 1.

CCT RECORD FORMATS

FIGURE 2.

HIGH LEVEL REPRESENTATION OF A MULTISPECTRAL CCT

FIGURE 3.

DETAILED CCT FORMAT

FIGURE 4.

THREE ARRANGEMENTS OF MSS IMAGE DATA

and the Canada Center for Remote Sensing late this year. This format, however, will be
limited to the Band Interleaved by Line (BIL) format of Landsat MSS image data and will
not contain several features such as the record number field, record type code, and the
spare data field as shown in Figure 1. This specific format is described in the
following two documents: 1) "Format Specifications for Canadian Landsat MSS System Corrected
Computer Compatible Tape", Research Report 75-3, dated July 1975, by the Canada Center
for Remote Sensing, Department of Energy, Mines, and Resources, Ottawa; and 2) "Format
Description of the U.S. Landsat MSS Universal Computer Tape", by Lottie E. Brown, GSFC
Document X-563-76-40, June 1976 (to be released July 1976).
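
The three multispectral arrangements named in Table 1 can be illustrated with a small Python sketch. Since Figure 4 is not reproduced here, the orderings below follow the conventional meaning of the three names and should be read as an assumption; the function names and the `image[band][line][pixel]` layout are hypothetical.

```python
def band_sequential(image):
    """BSQ: all of band 1, then all of band 2, and so on."""
    return [p for band in image for line in band for p in line]


def band_interleaved_by_line(image):
    """BIL: line 1 of every band, then line 2 of every band, and so on."""
    n_lines = len(image[0])
    return [p for l in range(n_lines) for band in image for p in band[l]]


def band_interleaved_by_pixel(image):
    """BIP: for each pixel in turn, its value in every band."""
    n_lines, n_pixels = len(image[0]), len(image[0][0])
    return [band[l][x]
            for l in range(n_lines) for x in range(n_pixels) for band in image]


# Two bands of a 2 line by 2 pixel scene.
scene = [
    [[1, 2], [3, 4]],   # band A
    [[5, 6], [7, 8]],   # band B
]
```

For this scene, BSQ gives `[1, 2, 3, 4, 5, 6, 7, 8]`, BIL gives `[1, 2, 5, 6, 3, 4, 7, 8]`, and BIP gives `[1, 5, 2, 6, 3, 7, 4, 8]`. BIP corresponds to the pixel interleaved storage of the NATO format described earlier.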

IMAGE CONTENT AND STRUCTURE


Chairpersons:  Azriel Rosenfeld, University of Maryland
               Russell A. Kirsch, Applied Mathematics Division, NBS

Panelists:     M. A. Fischler, Lockheed Research Lab
               K. S. Fu, Purdue University
               J. O'Callaghan, CSIRO, Australia
               R. F. Sproull, Xerox Corp.

Image Content and Structure
Chairpersons: Azriel Rosenfeld and Russell A. Kirsch

To  make  images  maximally  useful  for  interchange,  it  is  necessary  for  the  provider  of  the 
data  to  furnish  descriptions  of  the  image  content.    At  the  simplest  level  such  descriptions 
refer to the image as a whole, but most interesting and important image sources require
articulated descriptions of the image structure giving the relations of the parts of the image
as well as their names. Such image descriptions are useful not only for image data base
management, but also as standards for evaluating the success of pattern recognition
algorithms.

One  current  research  problem  is  the  development  of  suitable  languages  in  which  to 
embed  descriptions  of  image  content  and  structure.    This  session  of  the  Workshop  was  devoted 
to  a  discussion  of  the  status  of  research  and  related  work  in  graphic  languages  to  provide 
the  needed  basis  for  image  content  and  structure  description. 

The panelists were Dr. Martin A. Fischler from Lockheed Research Laboratories,
Professor K. S. Fu from Purdue University, Dr. John O'Callaghan from the Commonwealth Scientific
and  Industrial  Research  Organization  in  Australia,  and  Dr.  Robert  F.  Sproull  from  Xerox 
Corporation.    The  Chairpersons  of  the  session  were  Professor  Azriel  Rosenfeld  from  the 
University  of  Maryland  and  Russell  A.  Kirsch  from  the  National  Bureau  of  Standards. 

Rosenfeld  began  the  session  with  a  set  of  cautionary  remarks  concerning  the  difficulty 
of  image  description.    At  the  first  level,  descriptions  can  be  associated  with  whole  images; 
an  example  of  such  a  description  would  be  the  kind  of  header  information  included  in  the 
tape  formats  that  were  described  in  the  previous  session,  but  the  typical  header  would 
not include subject matter associated with the content of the image. Furthermore, the
content can include anything, and may have various interpretations depending on who the
user of an image is. At a somewhat deeper level of description, images can be described
in terms of the names of their parts; map overlays are a good example. But here we confront
the problem that noise for one purpose may be signal for another. Furthermore, the parts
of objects in real world images are very often fuzzy with respect to their boundaries.
Finally, specialized descriptions can be very subtle, and they can be nonsyntactic, referring
to extra-pictorial information. Even very general purpose descriptions, for example, "to
the left of," can be very difficult to implement in recognizers. (Winston at MIT discussed
this example in his Ph.D. thesis.)

Dr. Fischler then discussed scene description for images of real scenes. After
distinguishing scenes from their images, he discussed the possibility of describing scenes
with picture description languages. Since a picture description language describes a
single image and does not describe all aspects of a real scene (for example, hidden objects
and hidden relations), it is difficult to see how a picture description language can
be used to describe a real scene. Furthermore, insofar as a picture description language
is symbolic, there are many implied relations in the real scene which cannot be represented
in the symbols unless the symbols themselves are isomorphic with the real scene. He then
discussed certain psychophysical visual illusions, and argued that image perception includes
the consequences of visual illusions, which must also be comprehended within image
description. He suggested that although symbols might be difficult to use in describing pictures,
the pictures themselves or pictorial counterparts such as overlays might serve such a purpose.
In the discussion Dr. Sproull pointed out that one should be concerned with either
describing pictorial scenes or psychological artifacts, not both simultaneously, and that the
problem of describing scenes can be separated from that of describing the perceptual
processes for viewing them. Dr. O'Callaghan suggested that once the context for describing
an image is given, this simplifies the problem of one image being many things to many
people.

Professor Fu directed his attention to hierarchical languages. He noted that
a description that might be good for characterizing the structure of an image might not
be good for classification purposes. He showed tree and graph structures for describing
images and gave examples for both earth satellite imagery and fingerprint
classification images. He showed how a fingerprint may be partitioned into disjoint regions which can
themselves be easily classified, although this partitioning does not correspond to that used
by fingerprint experts. He showed a similar graph structure for describing a LANDSAT
picture classified for land use purposes.

In his remarks, Dr. O'Callaghan agreed that descriptions are relative to the context
and purpose that they serve. However, for certain pictures, we can sufficiently describe
the context and purpose to enable structural descriptions to be furnished. Such structural
descriptions can be based on objects, relations, and their attributes. Although such
descriptions can be adduced from images, the descriptions cannot necessarily be used to
direct automatic analysis or synthesis of images. Dr. O'Callaghan made the important
distinction between picture description languages which can be processed by computer
mechanisms and those which need not have computer implementations if they are to serve
as the basis for human interchange. He exhibited a solution to the problem of image content
and structure description in the application of wood morphology, in which he described
the microstructure of wood tissue in terms of the constituent parts and their relations.
In the discussion David Milgram commented that there is a conflict between Professor Fu's
computer languages and O'Callaghan's human languages, one presuming the possibility of
computer implementation and the other making no such presumption. Fischler agreed that there
need not be a single universal language ranging over different subject matters and ranging
from computers to people. John Dehne concurred that there are perhaps two kinds of
languages, those for describing a class of images and those for describing a particular one.

Dr. Sproull began his remarks by pointing out that some people believe if you can
understand something then you can synthesize it. This is somewhat the approach taken in
the  computer  graphics  field  with  respect  to  image  synthesis.    However,  in  computer  graphics, 
the  effects  depend  more  on  iconic  properties  than  they  do  on  realism.    He  gave  a  brief 
survey of computer graphics activities in the areas of modeling and description, which are
mostly  concerned  with  geometric  and  topological  properties,  and  hardly  with  photometry; 
in  the  area  of  image  synthesis,  which  has  dealt  primarily  with  such  problems  as  hidden 
line  and  hidden  surface  removal;  the  area  of  synthesis  and  analysis,  in  which  the  key  to 
analysis  of  images  is  a  comparison  at  higher  levels  than  the  individual  pixels  of  image 
pairs;  and  finally,  the  area  of  representation,  in  which  the  different  types  of  data 
structures  that  serve  for  computer  graphics  purposes  were  described.    He  summarized  the 
problem  of  structural  characterization  as  the  problem  of  deciding  what  questions  should 
be asked of an image. Once such a decision is made, the problem of structural
characterization becomes simplified.

Kirsch  concluded  the  session  with  some  remarks  concerning  description  of  actual  scenes. 
He  suggested  that  very  few  serious  attempts  have  been  made  at  describing  real  scenes. 
Most  such  attempts  have  been  concerned  primarily  with  artificial  scenes  as  in  the  computer 
graphics area. O'Callaghan's example for wood morphology would be one of the rare instances
in which a successful attempt has been made. With respect to mechanisms for picture
description he offered a caveat to Sproull's proposal that programming languages be used
since programming languages are sufficiently powerful that structural descriptions embedded
in them need not lead easily to analysis procedures. He suggested that Fischler's
difficulty in including psychological properties in image description goes beyond the purposes
the  image  description  should  serve.    He  concluded  by  suggesting  that  the  question  of 
whether  image  content  and  structure  can  be  described  in  suitable  artificial  languages  is 
still  an  open  question  largely  through  lack  of  serious  large  scale  attempts  to  deal  with 
the  question. 
ON COMMUNICATING ABOUT PICTURES


Martin A. Fischler
Lockheed Palo Alto Research Laboratory

This session of the workshop is concerned with evaluating the prospects for finding a
suitable Picture Description Language (PDL) to facilitate communication among Image
Pattern Recognition (IPR) researchers. It is therefore appropriate to ask whether such a
language is necessary, and what specific functions it could serve.

Specification of Classification Rules

The most obvious use for such a language would be to specify classification rules for the
imagery being processed. However, a representational scheme general and powerful enough to
effectively satisfy this objective would probably also provide a solution for most of the
remaining IPR problems (a rather ambitious goal). If such a scheme lacked the required
generality, it would hinder rather than aid communication. While the near term prospects
for finding such a language do not look promising, it is still useful to consider what
characteristics such a language must have.

Description For Scene Reconstruction

A second possible use for a PDL is related to the fact that in most current IPR work
(especially by workers having a "syntactic" orientation), no distinction is made between a
scene and an image. Indeed, for line type drawings, there is no difference. However, for
real (or natural) scenes, a single image is not an adequate representation for many
purposes (e.g., for determining whether or not two images correspond to the same scene).

Not only does a real scene have a 3-dimensional aspect which cannot be captured in a
single image, but most scenes are dynamic entities to some extent (e.g., leaves on trees
appear and disappear with the seasons, move in the wind, etc.); further, an image is a
function of the sensor, illumination source, transmission medium, and a host of other
factors in addition to the actual scene content.

If, as indicated above, we cannot adequately describe a scene with a single image, then
how can an IPR researcher communicate or record a suitable representation of the subject
of his investigations? Perhaps some PDL, or more likely, a combination of both imagery
and linguistic representation can prove suitable.

Desirable Characteristics of a PDL

Let us now address the question of the characteristics of a PDL which could simplify the
two scene description tasks mentioned above (i.e., specification of classification rules,
and description suitable for scene reconstruction). In particular, can a purely symbolic
language (i.e., one in which the relationship between the symbolic tokens and their real
world referents is completely arbitrary) serve this function? We might note that in spite
of the combined power of natural English and mathematical notation, we still find it
difficult to communicate about many subjects without resorting to graphics and use of
pictorial materials. Symbolic languages pose three problems which I believe make them
(by themselves) unsuitable for general picture description. These are:

a) The need to explicitly enumerate the various relationships between scene primitives,
as well as the need to define and completely decompose a scene into these primitives.
This is a task which is not practical for typical natural scenes.

b) The requirement for the satisfaction of the implicit assumption that the number of
variables needed to describe a scene is relatively small. Symbolic techniques do not lend
themselves to effective procedures in large number-of-variables problems. There is no
reason to believe that natural scenes (e.g., terrain scenes) can be encoded into messages
containing relatively small numbers of variables without significant loss of information.

c) Our lack of knowledge, or inability to describe, the psychological correlates of a
pictorial object. E.g., we are often surprised or amused by optical illusions, impossible
objects, etc. Any representational form which is insensitive to such information will
prove defective in applications where human response to pictorial data is a consideration.

Thus, if it is indeed the case that we cannot completely replace pictures with strings
of abstract symbols (and this is the approach taken in almost all current research on
computer based picture languages), we must see what can be done to use some combination
of symbols and pictures to communicate about pictures. A combined approach is probably
necessary, since pictures (or picture segments) are limited in their ability to imply
scene content which is not explicitly visible, while as noted above, symbolic description
cannot cope with the undirected complexity and psychological implications characteristic
of natural scenes.

Conclusions

In conclusion, I believe that our current state of knowledge about how to characterize
natural scenes is inadequate to anticipate near term standardization of methods for
communication about such scenes. However, I do believe that it may be possible, at this
time, to specify minimal criteria for documenting IPR work employing such scenes. To the
extent that we restrict our attention to artificial scenes, special applications, or line
type drawings, standardization of communication using a symbolic PDL may be currently
possible; but even here, the value of such standardization to the IPR community as a
whole is open to serious question.
SYNTACTIC APPROACH TO THE DESCRIPTION
OF IMAGE STRUCTURE AND CONTENT


K. S. Fu

Purdue University
W. Lafayette, Indiana 47907

Syntactic  approach  to  the  description  of  image  content  and  structure  is  discussed. 
Image  structures  can  be  described  in  terms  of  trees  and  graphs  with  nodes  representing 
pattern primitives and subpatterns, and branches representing relations between
subpatterns and primitives. Image content can be described in terms of the features or
attributes  (geometric  and/or  texture  measurements)  of  the  primitives  and  subpatterns. 
These  feature  measurements  could  be  interpreted  as  the  semantic  information  of  primitives 
and  subpatterns.     The  information  of  structure  and  content  can  also  be  used  for  image  data 
management  (storage  and  retrieval).    The  correctness  and/or  classification  of  such  image 
content  and  structure  description  can  be  verified  from  the  grammar  generating  the  tree 
or  graph  structures  with  the  associated  semantic  conditions.     Examples  are  given  to 
illustrate  this  approach. 
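
As a rough illustration of the idea in this abstract, the following Python sketch stores a description as a tree whose nodes carry attribute measurements and whose branches carry relation names, then checks it against a toy grammar with semantic conditions. The node labels, the `GRAMMAR` and `CONDITIONS` tables, and the attribute names are assumptions made for illustration; they are not taken from Professor Fu's examples.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A pattern primitive or subpattern with its measured attributes."""
    label: str
    attributes: dict = field(default_factory=dict)
    # Each branch is a (relation name, child node) pair.
    branches: list = field(default_factory=list)


# Hypothetical productions: label -> required sequence of (relation, child label).
GRAMMAR = {
    "cell": [("contains", "membrane"), ("contains", "nucleus")],
    "membrane": [],
    "nucleus": [],
}

# Hypothetical semantic conditions on the attributes, keyed by label.
CONDITIONS = {
    "nucleus": lambda attrs: attrs.get("area", 0) > 0,
}


def is_valid(node):
    """Check structure against GRAMMAR and attributes against CONDITIONS."""
    if node.label not in GRAMMAR:
        return False
    actual = [(relation, child.label) for relation, child in node.branches]
    if actual != GRAMMAR[node.label]:
        return False
    condition = CONDITIONS.get(node.label)
    if condition is not None and not condition(node.attributes):
        return False
    return all(is_valid(child) for _, child in node.branches)


cell = Node("cell", {"area": 120.0}, [
    ("contains", Node("membrane", {"perimeter": 40.0})),
    ("contains", Node("nucleus", {"area": 15.0})),
])
incomplete = Node("cell", {}, [("contains", Node("nucleus", {"area": 15.0}))])

print(is_valid(cell), is_valid(incomplete))
```

The same tree could also serve as a retrieval key for image data management, since both the structure and the attribute values are explicit.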
LANGUAGES FOR IMAGE CONTENT AND STRUCTURE
J.  F.  O'Callaghan 
CSIRO  Division  of  Computing  Research,  Australia 

The  following  statements  outline  my  position: 

1.  Research  in  picture  interpretation  and  graphics  has  developed  various 
languages  and  representations  for  images  (e.g.  tree,  array,  plex,  web 
grammars).     Such  representations  are  predicated  on  structural  frameworks 
of  objects,  their  parts,  relations  and  attributes.    An  image  may  be 
described  by  presenting  the  representation  at  a  certain  level  of  detail  or 
by relating one representation to another.

2.  Most  research  efforts  have  been  confined  to  restricted  classes  of  line 
drawings,  using  well-defined  geometrical  relations.    It  has  proved  difficult 
to  computationally  define  descriptions  which  are  meaningful  for  humans  and 
which account for 'real-world' data. One problem is the ambiguous nature
of the data amongst different observers.

3.  Attempts  to  automatically  extract  (or  synthesize)  descriptions  for  given 
images  have  largely  failed.    A  major  reason  has  been  the  problem-solving 
nature  of  the  extraction  process,  requiring  information  not  usually  supplied 
with  structural  representations. 

4.  Given  the  status  of  research,  descriptions  of  images  for  interchange  must  be 
considered  within  restricted  domains  and  purposes  (e.g.  chest  x-rays  for 
demonstration  of  lung  cancer),  where  a  context  for  term  definitions  could  be 
unambiguously established. It would not be possible in general to automatically
extract descriptions of content and structure from image data.
However,  it  is  conceivable  that  such  descriptions  could  be  transported  and 
interrogated  as  a  data  base  in  a  restricted  dialogue. 
Constructive Descriptions of Images


Robert F. Sproull
Xerox Palo Alto Research Center


A  common  method  for  documenting  the  structure 
of  an  object  is  to  document  the  process  of 
constructing  it  from  a  small  set  of  primitive 
ingredients.  Such  structural  synthesis  of  images  is 
common  in  computer  graphics,  which  has 
concentrated  since  inception  on  generation  rather 
than  analysis  of  images.  This  note  recounts 
computer  graphics  techniques  applicable  to 
describing  images  but  does  not  offer  any  new 
solutions. 

The constructive descriptions used in graphics are
often abstract models of a real situation: images
generated from the models are intended to
communicate information to a human who
interprets the abstraction. For example, a
computer model of an electronic circuit is often
displayed using conventional abstract symbology of
electrical engineering. The model may also contain
additional information adequate for calculating the
physical placement of components, the circuit
response, or the manufacturing cost. The design of
the descriptive model is governed by the
information it must represent, that is, by the
questions that are asked of the model.

We will survey two specific sorts of models that
may be suggestive of ways to model the structure
of some images. First, we can attempt to model
physical objects and the processes by which images
of these objects are made on the retina. The
model, or information derived from it, yields
information about the structure of the image.
Second, we can use abstract graphics models to
describe the image itself, independent of the
processes that generated it.

Three-Dimensional  Geometric  Models 

Techniques have emerged from computer graphics
research for modeling certain three-dimensional
objects and for producing somewhat realistic
images of scenes composed of these objects. The
geometric models of objects commonly use
polyhedra, parametric polynomial surfaces and
conic surfaces as primitive elements [11, 1].
Increasingly, computational geometric techniques
seek to model complex objects or complex
assemblies of objects by sequences of simple
operations applied to primitive objects [2, 4, 8].

Current methods for synthesizing images from
such models are compromises between efficiency
and proper simulations of imaging [12]. Although
some hidden-surface elimination techniques
correctly solve the geometric problem of
determining which surfaces are visible, only crude
illumination models are used to calculate
intensities on the surface. These methods are
adequate for creating images intelligible to
humans, but fall far short of simulations of reality.

To decide whether a model captures the structure
of a natural image, we can compare the image with
one synthesized from the model. The comparison
can be done on the sampled image directly, or
perhaps after some processing such as feature
extraction [6, 3]. Errors detected by the
comparison need to be related to an error in the
model, which may represent a structural fault (e.g.,
omission of an object), or a less serious difficulty
(e.g., improper illumination of an object). See
Figure 1.

The  practical  application  of  these  techniques  is 
severely  limited  to  a  small  class  of  scenes. 
However,  the  class  increases  as  humans 
increasingly  construct  physical  artifacts  that  have 
geometrical  models:  the  number  of  bits  used  to 
describe  the  shape  of  a  General  Motors  automobile 
descends  as  more  of  its  parts  are  designed  by  a 
computer-aided-design  system. 

The  class  of  images  susceptible  to  synthesis  from 
models  can  doubtless  be  increased  by  more 
research.  For  example,  why  is  it  not  possible  to 
build  models  of  X-ray  imaging  processes  and  of 
chest  cavities  to  describe  a  patient's  chest  and  the 
corresponding  chest  X-ray?  The  benefits  of  a 
physical analog, the phantom patient, are evident
in training radiographers.

Two-Dimensional  Image  Descriptions 

If synthesis from a model of the scene is not
feasible, it may be possible to describe the
structure of the image as a synthesis of regions,
each of which plays a structurally relevant role.
We can describe a region by a description of the
region boundaries, together with annotations
which identify and describe the region and may
relate it to other regions in the image. Boundary
descriptions can be simple closed geometric
figures (e.g., polygons, conic sections) or
sequences of primitive boundary elements (e.g., lines,
parametric curves, or even points) [10]. It is a
simple matter to determine from any of these
descriptions which pixels lie within a region. The
relations among regions contain valuable
information about the structure of the image. See
[8] for an example of two-dimensional region
descriptions and of relations among the regions.
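
For the polygon case, one way to determine which pixels lie within a region is the even-odd ray-crossing test applied to each pixel center. The sketch below is a minimal Python illustration under that assumption; the function names and the choice to sample pixel centers are not from the original paper, and curved boundaries would need a different test.

```python
def point_in_polygon(x, y, vertices):
    """Even-odd rule: count polygon edges crossed by a ray going in +x from (x, y)."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def pixels_in_region(vertices, width, height):
    """Return (column, row) for every pixel whose center lies inside the polygon."""
    return [
        (col, row)
        for row in range(height)
        for col in range(width)
        if point_in_polygon(col + 0.5, row + 0.5, vertices)
    ]


# A hypothetical rectangular region inside a 5 x 4 pixel image.
rectangle = [(1, 1), (4, 1), (4, 3), (1, 3)]
print(pixels_in_region(rectangle, 5, 4))
```

Applying the same test to two regions gives a simple check of a "within" relation: every pixel of the inner region should also belong to the outer one.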

The  region  descriptions  can  be  viewed  as  a  set  of 
"overlays,"  each  of  which  describes  a  region  of 
structural  interest  in  the  image.  The  spatial 
descriptions  and  annotations  can  also  be  viewed  as 
a  recipe  for  constructing  the  image  from  its 
structural  components. 

Representations 

Straightforward representations of regions that are
readable by both humans and computers are
feasible. Perhaps the basic element of description
is the relation; although the examples use the
notation of Leap [7], the methods of relational
data bases [5] are equally applicable. Relations are
expressed as triples that relate three items: A ⊗ O
= V. This can be read as "A of O is V." An item
may have a datum, consisting of numerical or
string data to be associated with the item. For
example, an item that represents a boundary might
have as datum a string that gives the chain
encoding of the boundary.

Figure 1. An example of comparing a synthetic
image and an image derived from a natural scene.
The top image is synthesized from a geometric
model of a water pump and impeller; the bottom
image is derived from a natural image of the
scene. Taken from [3].

The  simplified  example  below  shows  how  one 
might  describe  an  image  consisting  of  a  single 
cell.  The  boundary  descriptions  for  the  cell  and 
nucleus  have  presumably  been  traced  by  an  expert. 
Appropriate  medical  findings  are  also  recorded: 

;  Description of the parts of the image.
PartOf ⊗ Image = Cell1

;  Description of the cell, and enumeration of its parts.
PartOf ⊗ Cell1 = Membrane1
PartOf ⊗ Cell1 = Nucleus1
Type ⊗ Cell1 = RedBloodCell
Pathology ⊗ Cell1 = Normal

;  Relations between regions and structural parts (nucleus, cell)
Region ⊗ Cell1 = Region1
Region ⊗ Nucleus1 = Region2
Within ⊗ Region1 = Region2

;  Properties of regions
Boundary ⊗ Region1 = Boundary1
Boundary ⊗ Region2 = Boundary2

;  Coding information for the boundaries of the regions
CodingType ⊗ Boundary1 = Chain
CodingType ⊗ Boundary2 = Chain

;  A chain coding (taken from Freeman)
datum(Boundary1) = "04260000104270000322111070067655445442240400"
datum(Boundary2) = "0426000030427000032210765530400"

This  example  shows  but  one  of  many  possible  ways 
to  record  information  about  the  structure  and 
relationships  among  regions  of  an  image.  For 
example,  we  could  give  a  computer  program  that 
will  synthesize  a  similar  image,  provided  the 
program  itself  conveys  structural  information. 
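
To suggest how such triples might be read and questioned by a program, here is a minimal Python sketch of a triple store loaded with the cell example. The `TripleStore` class and the `boundary_of` helper are hypothetical names introduced for illustration; this is not the Leap implementation, and the datum string is copied from the example with its line wrap joined.

```python
class TripleStore:
    """A minimal store of (attribute, object, value) triples with optional data."""

    def __init__(self):
        self.triples = []
        self.datum = {}  # item name -> numerical or string datum

    def add(self, attribute, obj, value):
        self.triples.append((attribute, obj, value))

    def values(self, attribute, obj):
        """Every V for which 'attribute of obj is V' has been recorded."""
        return [v for a, o, v in self.triples if a == attribute and o == obj]


store = TripleStore()
for triple in [
    ("PartOf", "Image", "Cell1"),
    ("PartOf", "Cell1", "Membrane1"),
    ("PartOf", "Cell1", "Nucleus1"),
    ("Type", "Cell1", "RedBloodCell"),
    ("Pathology", "Cell1", "Normal"),
    ("Region", "Cell1", "Region1"),
    ("Region", "Nucleus1", "Region2"),
    ("Within", "Region1", "Region2"),
    ("Boundary", "Region1", "Boundary1"),
    ("Boundary", "Region2", "Boundary2"),
    ("CodingType", "Boundary1", "Chain"),
    ("CodingType", "Boundary2", "Chain"),
]:
    store.add(*triple)
store.datum["Boundary2"] = "0426000030427000032210765530400"


def boundary_of(store, part):
    """Follow Region and Boundary relations from a structural part to its coding."""
    region = store.values("Region", part)[0]
    boundary = store.values("Boundary", region)[0]
    coding = store.values("CodingType", boundary)[0]
    return boundary, coding, store.datum.get(boundary)


print(store.values("PartOf", "Cell1"))
print(boundary_of(store, "Nucleus1"))
```

A linear scan over a list is enough at this scale; a relational data base, as the text notes, would serve equally well for larger collections.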

The  appeal  of  descriptions  such  as  these  is  that 
they  can  be  reasonably  understood  by  humans,  and, 
if  properly  designed,  can  be  read  and  processed  by 
computer.  An  image  can  thus  carry  with  it  a  data 
base  to  which  important  questions  can  be 
addressed:  it  can  record  the  analysis  of  expert 
observers  of  the  original  image.  It  should  be 
feasible  with  present  understanding  to  design  a 
standard  representation  for  this  data  base. 
Documentation of the Recording Environment
Chairpersons: Judith M.S. Prewitt and John M. Evans, Jr.

The  objective  of  this  session  was  to  identify  the  common  problems  associated  with 
the  transformation  from  a  physical  object  to  a  digitized  image.    Optics,  electronics, 
analog-to-digital  converters  all  influence  the  resulting  data  and  their  effects  must 
be  calibrated  and  documented  to  insure  correct  and  successful  subsequent  processing  in 
image  pattern  recognition  work. 

The  session  was  organized  as  a  series  of  eight  presentations  by  speakers  experienced 
in  and  expert  in  selected  diverse  and  complementary  aspects  of  this  set  of  problems. 
The  speakers  discussed  both  the  generic  problems  associated  with  image  digitization 
and  those  specific  problems  associated  with  medical  and  satellite  imagery  applications. 

The questions and answers for each topic follow the respective papers of the speakers.
This  format  was  adopted  to  keep  the  flow  and  relevance  of  the  discussion  in  the  context 
of  the  papers  of  each  speaker. 

CONCLUSIONS: Whereas the problems of geometric distortion, resolution, vignetting,
amplifier  and  sensor  non-linearities,  and  A/D  converter  errors  are  generally  understood 
and  recognized,  there  are,  in  general,  no  commonly  used  universal  calibration  test 
patterns  or  techniques  to  remove  these  effects  from  the  final  digitized  image  data. 
Every  investigator  makes  the  attempt  to  remove  the  effects  of  the  recording  environment, 
but  there  is  a  need  for  test  patterns  and  calibration  techniques,  and  for  associated 
documentation,  to  provide  better  uniformity  and  higher  quality  in  digitized  imagery. 

Universal  test  patterns  for  automatic  image  pattern  recognition  would  fall  into 
two  classes: 

First,  for  digitizing  systems  working  from  film:  a  checkerboard  pattern  with  sharply 
defined  grid  edges,  with  stepped  and  continuous  gray  scale  portions,  and  with  a  uniform 
gray  border  to  detect  vignetting. 

Second,  for  microscopy:  a  similar  pattern  deposited  on  a  microscope  slide. 
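
A test pattern of the first kind can be sketched in a few lines of Python. The layout below (a checkerboard above a stepped gray scale above a continuous ramp, all inside a uniform gray border) and every dimension in it are assumptions chosen for illustration; the workshop did not specify a layout.

```python
def make_test_pattern(size=64, border=8, check=8, steps=8, gray=128):
    """Return a size x size image, as a list of rows of 8-bit gray values."""
    # Start with uniform gray everywhere; the untouched margin is the border
    # used to detect vignetting.
    image = [[gray] * size for _ in range(size)]
    inner = size - 2 * border
    half = inner // 2
    for r in range(inner):
        for c in range(inner):
            if r < half:
                # Checkerboard with sharply defined edges.
                value = 255 if ((r // check) + (c // check)) % 2 == 0 else 0
            elif r < half + half // 2:
                # Stepped gray scale: `steps` equal-width bands from black to white.
                value = (c * steps // inner) * 255 // (steps - 1)
            else:
                # Continuous gray scale ramp from black to white.
                value = c * 255 // (inner - 1)
            image[border + r][border + c] = value
    return image


pattern = make_test_pattern()
print(len(pattern), len(pattern[0]), sorted(set(pattern[32])))
```

Digitizing a film copy of such a pattern and comparing the result with the generating function gives a direct view of geometric distortion, gray scale response, and vignetting.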

There  was,  in  addition,  discussion  of  calibration  of  ultrasonic  pulse  and  imagery 
systems, but no clear consensus appeared on calibration techniques or test objects or
images.
INSTRUMENT PARAMETERS IN IMAGE DIGITIZATION
Wayne  R.  Huelskoetter 
DICOMED  Corporation 

INTRODUCTION 

Defining an image in digital form is a relatively straightforward process--or so it
seems on the surface. The image or picture is simply subdivided into a number of points
commonly referred to as picture elements or "pixels"; each pixel has a number
assigned to it representing the brightness of the image at the pixel point.

However,  to  extract  useful  information  from  the  digitized  image,  it  might  be  helpful 
to  know  a  few  parameters  associated  with  the  digitization  process. 

DIGITIZING  PARAMETERS 

Resolution 

The first question usually asked relative to a digitized image is "what's the
resolution?" The typical answer is "1024 x 1024" or "25 microns". Neither answer, of course,
defines the resolution of the digitized image. In fact, the term "resolution" is one of
the most confusing, misused and misrepresented parameters in digital image processing.
Because of the large number of disciplines involved in the manufacturing and use of
digitizing equipment, the variety of terminology of these disciplines compounds the confusion.
The photographic expert feels comfortable with line pairs/mm or microns, while the
computer scientist prefers a nice binary number. The electronic engineer may prefer TV
lines/inch, and the optical engineer prefers MTF. Different equipment technologies make it more
convenient to use one term as opposed to another. For example, the resolution of a
microdensitometer is easier to describe in terms of aperture width, while with an electronic
scanner using a TV tube, image dissector tube or CRT, it is more convenient to speak in
terms of TV lines/inch.

Rather  than  launching  into  a  lengthy  discussion  of  the  merits  and  problems  associated 
with  the  various  terms  currently  used  to  define  resolution,  let's  take  a  look  at  a  set  of 
parameters  to  be  used  to  define  the  resolution  characteristics  of  a  digitized  image  which 
can  be  applied  regardless  of  the  scanning  or  digitizing  device  employed. 
1. Scanning matrix: The number of pixels per line and number of lines per image
   (e.g., 1024 x 1024, 480 x 1240, etc.)

2. Pixel size and shape: The actual size referenced to the film plane, specified in
   microns. The shape (circular, square or rectangular) and the intensity distribution
   should be specified.

3. Pixel spacing: The distance between adjacent pixels and adjacent lines as measured
   between centers and specified in microns or millimeters.

4. Scan area: The dimensions of the area scanned in millimeters.

Photometrics

The  most  common  way  of  specifying  the  photometric  characteristic  of  a  pixel,  and  hence 
the  digitized  image,  is  the  number  of  gray  levels;  i.e.  64  (6-bit)  or  256  (8-bit).  While 
these  numbers  are  meaningful,  they  tell  only  part  of  the  story.    It  is  also  necessary  to 
know  if  the  levels  are  transmittance  or  density  levels,  or  some  other  function  of  input 
codes. Signal to noise (S/N) ratio of the digitizing equipment is also an important
consideration. For example, one could hardly expect 256 accurate levels with a signal to
noise ratio of 100 to 1.
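
The arithmetic behind this remark can be sketched as follows. The sketch assumes that S/N is quoted as an amplitude ratio of full-scale signal to noise, and that two gray levels count as accurately distinguishable only when they are at least one noise amplitude apart; both are simplifying assumptions, not definitions given in the paper.

```python
import math


def snr_in_db(ratio):
    """Convert an amplitude signal-to-noise ratio to decibels."""
    return 20.0 * math.log10(ratio)


def supportable_levels(ratio):
    """Gray levels that stay at least one noise amplitude apart (assumed criterion)."""
    return int(ratio)


def supportable_bits(ratio):
    """Whole bits per pixel that those levels can fill."""
    return int(math.floor(math.log2(ratio)))


for ratio in (100, 256, 1000):
    print(ratio, round(snr_in_db(ratio), 1), supportable_levels(ratio), supportable_bits(ratio))
```

Under these assumptions a ratio of 100 to 1 (40 dB) supports roughly 100 levels, or 6 full bits, which is why an 8-bit (256 level) claim needs an S/N figure alongside it.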

If  an  electronic  scanner  is  used,  it  may  also  be  necessary  to  know  the  uniformity  of 
the  scanning  system.    This  is  usually  expressed  as  a  percentage.    Therefore,  to  adequately 
describe  the  photometric  characteristics  of  an  image,  the  following  information  should  be 
provided. 

1.  Transmittance  or  density  levels:    i.e.  64  or  256 

2.  Signal to noise ratio (S/N):  dB

3.  Density  range  of  image:    i.e.  0.02  -  1.5D 

4.  Digitizing  system  uniformity  (if  applicable) 

The  digitization  of  a  color  image  adds  another  parameter  to  be  concerned  about--spectral 
response  of  the  sensor,  the  color  filters,  light  source,  etc.    In  other  words,  it  takes 
more  than  simply  digitizing  the  image  three  times  through  red,  blue  and  green  filters. 
Unless  the  spectral  response  of  the  digitizing  system  is  reasonably  equal,  relative  to  the 
red,  blue  and  green  components,  the  resulting  digitized  image  will  not  be  an  accurate 
representation of the image.

Geometrics

For many applications of digital image processing, particularly those involving
measurements, the geometric accuracy of the scanning system is important. Geometric
parameters of importance include:

1. Orthogonality between X and Y axis: specified in degrees.

2.  Rotation  of  scan  lines  to  film. 

3.  Scan  line  curvature:    percentage  deviation  from  a  straight  line. 

4.  Scan  line  linearity:    percentage  deviation  from  an  ideal  pixel  to  pixel  spacing. 

It  is  also  important  to  understand  how  the  various  parameters  are  defined  and  measured 
for a particular device. Numbers without definitions are meaningless.

Other Considerations

The parameters of resolution, photometrics and geometrics as described above will
adequately describe an image in most cases. However, some applications may require knowledge
of the stability, repeatability and drift of the digitizing device used. These
characteristics become very important when digitizing color images since usually three scans of
the image are required. Any drifting in the spatial positioning or scan position due to
color  shift  through  the  filter  can  be  significant. 

The speed of the digitizing device is, of course, an important throughput parameter
for the person doing the scanning. However, once the data has been recorded on tape and
sent to another person, the speed has little significance. Therefore, except to the
extent that the scanning speed has a direct effect on S/N (in that case the S/N ratio
pertaining to the scan speed should be referenced), it is not a parameter needed to define the
digitized image.

SUMMARY 

As we all know, no digitizing device is perfect. Each has advantages and
disadvantages, and trade-offs must be made based on the needs of the application. The resultant
digitized  image  will  be  distorted  to  some  degree.    Therefore,  it  is  important  to  understand 
characteristics  of  the  digitized  image  in  terms  understandable  by  all  of  the  recipients  of 
that  digitized  image.    The  parameters  discussed  above  and  summarized  below  should,  in  the 
majority  of  cases,  define  the  image  independent  of  the  scanning  techniques. 

1. Resolution               2. Photometrics                        3. Geometrics

a. Scan matrix              a. Transmittance or density levels     a. Orthogonality
b. Pixel size and shape     b. Signal to noise ratio               b. Rotation
c. Pixel spacing            c. Density range of image              c. Scan line curvature
d. Scan area                d. System uniformity                   d. Scan line linearity
                            e. Spectral response (color)
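
The summary above maps naturally onto a record that could travel with a digitized image. The Python sketch below is one possible shape for such a record; the class name, field names, units, and sample values are all assumptions for illustration, and the consistency check simply multiplies pixel count by pixel spacing.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class DigitizationRecord:
    # 1. Resolution
    scan_matrix: Tuple[int, int]           # (pixels per line, lines per image)
    pixel_size_um: float                   # size at the film plane, microns
    pixel_shape: str                       # "circular", "square" or "rectangular"
    pixel_spacing_um: Tuple[float, float]  # (pixel to pixel, line to line), microns
    scan_area_mm: Tuple[float, float]      # (width, height), millimeters
    # 2. Photometrics
    gray_levels: int
    level_type: str                        # "transmittance" or "density"
    snr_db: float
    density_range: Tuple[float, float]
    uniformity_percent: Optional[float] = None
    spectral_response: Optional[str] = None
    # 3. Geometrics
    orthogonality_deg: Optional[float] = None
    rotation_deg: Optional[float] = None
    curvature_percent: Optional[float] = None
    linearity_percent: Optional[float] = None

    def implied_scan_area_mm(self):
        """Scan area implied by the scan matrix and the pixel spacing."""
        pixels, lines = self.scan_matrix
        dx_um, dy_um = self.pixel_spacing_um
        return (pixels * dx_um / 1000.0, lines * dy_um / 1000.0)

    def area_is_consistent(self, tolerance_mm=0.1):
        """True when the stated scan area agrees with the implied one."""
        implied = self.implied_scan_area_mm()
        return all(abs(a - b) <= tolerance_mm for a, b in zip(implied, self.scan_area_mm))


# Hypothetical values for a 1024 x 1024 scan at 25 micron spacing.
record = DigitizationRecord(
    scan_matrix=(1024, 1024), pixel_size_um=25.0, pixel_shape="circular",
    pixel_spacing_um=(25.0, 25.0), scan_area_mm=(25.6, 25.6),
    gray_levels=256, level_type="density", snr_db=48.0, density_range=(0.02, 1.5),
)
print(record.implied_scan_area_mm(), record.area_is_consistent())
```

Keeping pixel size and pixel spacing as separate fields reflects the point made earlier in the paper that neither "1024 x 1024" nor "25 microns" alone defines resolution.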

Just  as  in  the  evaluation  of  music  where  the  quality  is  determined  by  the  pleasure  of 
the  listener,  the  quality  of  the  digitized  image  will  be  determined  by  how  well  it  fits  the 
needs  of  the  user. 
Discussion

Wayne  Huelskoetter 
Diecomed  Corp. 

Instrument  Effects  in  Digitizing  an  Image 

Question:    These  effects  are  difficult  to  measure.    Is  a  universal  test  image  possible? 

Answer:       Ronchi  rulings  are  now  often  used.    A  universal  test  image  may  be  possible. 

Comment:      In digitizing photographs using a drum scanner, sudden intensity changes lead
to transient distortions because of inevitable instrument response time constants.
This implies a variable signal to noise ratio.

Question:    How  accurately  can  you  return  to  the  original  pixel? 

Answer:  In electronic systems, after warm up, you can get back pretty well (±1 pixel).

Question:    What do you mean by 25 micron spot size?

Answer:  This  is  a  convolution  of  the  instrument  function  with  the  source  function, 
which  is  a  smoothly  peaked  function.  Spot  size  obviously  depends  on  where 
you  make  your  measurements,  for  example,  half  width  at  half  maximum. 
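
As a rough illustration of why a quoted spot size depends on where the width is measured, the sketch below measures one profile at two different levels. The Gaussian shape, the grid spacing, and the function names are assumptions made for this example only; they do not describe any particular scanner. The code reports full widths; the half width mentioned in the answer is simply half of the corresponding value.

```python
import math

def gaussian_profile(sigma_um, half_span_um=100.0, step_um=0.05):
    """Sample a unit-height Gaussian spot profile on a symmetric grid (microns)."""
    n = int(round(half_span_um / step_um))
    xs = [i * step_um for i in range(-n, n + 1)]
    ys = [math.exp(-0.5 * (x / sigma_um) ** 2) for x in xs]
    return xs, ys

def full_width_at(xs, ys, fraction):
    """Full width of a single-peaked profile at `fraction` of its maximum,
    using linear interpolation between the samples that straddle the level."""
    level = fraction * max(ys)
    peak = ys.index(max(ys))
    i = peak
    while i > 0 and ys[i] >= level:            # walk left until below the level
        i -= 1
    left = xs[i] + (level - ys[i]) * (xs[i + 1] - xs[i]) / (ys[i + 1] - ys[i])
    j = peak
    while j < len(ys) - 1 and ys[j] >= level:  # walk right until below the level
        j += 1
    right = xs[j - 1] + (level - ys[j - 1]) * (xs[j] - xs[j - 1]) / (ys[j] - ys[j - 1])
    return right - left

# Hypothetical spot whose full width at half maximum is 25 microns.
sigma = 25.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))
xs, ys = gaussian_profile(sigma)
width_half_max = full_width_at(xs, ys, 0.5)             # about 25 microns
width_1_over_e2 = full_width_at(xs, ys, math.exp(-2))   # about 42 microns
```

The same profile therefore yields a "spot size" of roughly 25 or 42 microns depending on the chosen level, which is why the measurement convention should accompany the number.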

Question:    What  about  modulation  transfer  function  (MTF)? 

Answer:       Can  use  MTF  but  it  doesn't  always  mean  anything.    We  have  tried  to  use  it. 
Comment:     Resolution  and  grey  scale  interact. 
Answer:  Yes. 

Question:    One  must  distinguish  between  resolution  elements  and  pixels.    S/N  ratios 
are  defined  only  in  a  resolution  element.    This  is  particularly  a  problem 
for  opaque  material  because  you  can't  get  opaque  scanners.    How  do  you  define 
total  resolution? 

Answer:       You  can  use  a  test  image. 

Question:    How  much  resolution  do  you  need  in  the  test  image? 

Answer:       That  depends  on  your  need.    You  don't  want  to  overdo  it.    People  don't  analyze 
their  requirements.    The  best  resolution  may  not  imply  the  best  instrument 
for  a  given  application.    For  example,  a  system  with  given  resolution  may 
have  to  run  much  slower  than  another  to  get  the  same  S/N  ratio. 
CALIBRATION  OF  TELEVISION  MICROSCOPES


K.  Preston,  Jr. 

Carnegie-Mellon  University 
Pittsburgh,  PA  15213 


1.  INTRODUCTION 

With the commercialization of clinical television microscopy for automatic
cell analysis and hematology, about 1 billion cell images are digitized and processed
each year in the United States. Proper calibration is necessary in order to assure
the reliability and repeatability needed to produce valid cell identifications in
these machines. In these microscopes image information is transferred from the object
(a cell) on the microscope slide via the imaging optics into image space using
electromagnetic radiation in the visible portion of the spectrum. Calibration of this
process must include data on colorimetry, resolution, linearity, and dynamic range. It is
the purpose of this paper to outline certain calibration techniques which are used for
this purpose.

2.  SCANNERS

There are two primary types of television microscope scanners. One is the
image plane scanner; the other, the flying-spot scanner. In the image plane scanner,
the light detector lies in the plane which is conjugate to the eyes of the viewer.
Such a configuration is easily instrumented using standard commercial microscopes and
taking advantage of the trinocular tube which is ordinarily used for photomicrography.

The flying-spot scanner differs from the image plane scanner in that the
illumination source is placed in what is usually the image plane of the microscope. The
microscope optics form an image of this plane onto the specimen. The size of the source
is kept small (equal to 1 picture element in diameter) which leads to the name
"flying-spot". Light transmitted through the specimen on the microscope slide is collected by
the microscope condenser optics, which, in turn, cause it to impinge on the light
detector. Flying-spot scanners are easily implemented using cathode ray tubes placed
in the image plane with the illumination directed down the trinocular to the specimen.
An excellent survey paper describing such scanners is provided by Mansberg and
Ohringer, Ann. N.Y. Acad. Sci. 157, 5-37 (1969).

Modern scanners (especially commercial scanners for cytology automation) operate
under control of a digital processor. Also included is automatic transport of the
microscope slide which, of course, includes both automatic focusing and the automatic
location of cells on the surface of the slide itself. The systems use closed-loop
focusing mechanisms which maximize the high video frequencies in the television signal.
Some systems are capable of maintaining focus within a few microinches.

3.  CALIBRATION  OF  IMAGE  QUALITY

There are four methods of documenting the recording environment as far as the
performance of the scanning mechanisms is concerned: (1) resolution calibration, (2)
color calibration, (3) calibration of linearity, and (4) calibration of dynamic
range.

Resolution may be calibrated quantitatively by measuring the modulation transfer
function of the system using either bar or sinewave targets or by producing images of
fine structures such as diatoms. Unfortunately, the modulation transfer function
varies across the field of view and mechanisms such as the contourgraph must be used
to determine system performance across the entire field of view.

Colorimetric calibration is usually performed in a global fashion using a prism
or grating monochromator. This device measures the overall spectral response
including the spectral output of the light source and the spectral sensitivity of the
photodetector. Such calibration is necessary in order to compensate for varying
levels of photon flux as the illumination wavelength is changed or as the color of
the specimen changes.

Linearity and dynamic range are usually calibrated simultaneously by varying the
photon flux through the optical system at fixed colors or over the entire spectrum.
In order to simplify the calibration the entire spectrum (white light) is frequently
used and the mechanism for varying level is the so-called "neutral density filter". The
output of the detector is measured as a function of photon flux. At the same time
noise measurements are performed in order to determine the calibration uncertainty. This
latter measurement must be accompanied by a measurement of the bandwidth of the
associated video amplifier.
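
A minimal sketch of how such a neutral density filter series might be reduced to linearity and dynamic range figures is given below. The readings are made up, the function names are hypothetical, and the particular definitions used (worst deviation from a least-squares line as a percentage of span, and span divided by RMS noise expressed in dB) are one common convention assumed here rather than the procedure of any specific instrument.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def linearity_report(densities, outputs, noise_rms):
    """Summarize detector linearity and dynamic range from a filter series."""
    # A neutral density filter of optical density D transmits 10**(-D) of the light.
    flux = [10.0 ** (-d) for d in densities]
    slope, dark_level = fit_line(flux, outputs)
    span = max(outputs) - dark_level          # usable signal above the fitted dark level
    worst = max(abs(y - (slope * f + dark_level)) for f, y in zip(flux, outputs))
    return {
        "slope": slope,
        "dark_level": dark_level,
        "max_nonlinearity_pct": 100.0 * worst / span,
        "dynamic_range_db": 20.0 * math.log10(span / noise_rms),
    }

# Made-up readings from an ideal detector: 200 units per unit flux plus a dark level of 5.
densities = [0.0, 0.3, 0.6, 1.0, 1.5, 2.0]
outputs = [200.0 * 10.0 ** (-d) + 5.0 for d in densities]
report = linearity_report(densities, outputs, noise_rms=0.5)
```

Because the noise figure depends on the bandwidth over which it is measured, the bandwidth of the video amplifier would need to be recorded alongside such a report.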

4.  CONCLUSIONS 

In order that images digitized either by flying-spot or image plane television
microscope scanners may be validated for use in experiments in image enhancement, image
bandwidth reduction, image transmission, and image mensuration for the purpose of
pattern recognition, a set of standardized tests should be established which will
define the recording environment in as quantitative a manner as possible. The spectral
response of the system should be determined in a relative manner using either standard
color filters or a monochromator of a standard slit width (expressed in equivalent
nanometers). This will permit a correction of the stored image for the spectral
sensitivity of the scanning detector as well as the spectral emissivity of the associated
light source.

In addition it is necessary to calibrate resolution and modulation transfer
function, both of which can be done by the simple expedient of scanning a knife edge at
different parts of the field or by producing a contourgraph. The Laplace transform or a
Fourier analysis of this step-function response is all that is required to determine
the MTF. Lastly, dynamic range and linearity may be determined using an appropriate
set of neutral density filters.
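
The knife-edge route to the MTF can be sketched as follows: the sampled step response is differentiated to give a line spread function, and the magnitude of its Fourier transform, normalized at zero frequency, is taken as the MTF. This is only an illustrative sketch: the Gaussian blur, the sample count, and the function name are assumptions, and a practical measurement would also need noise handling and windowing, which are omitted here.

```python
import cmath
import math

def mtf_from_edge(esf):
    """Estimate the MTF from equally spaced samples of an edge (step) response."""
    # Differentiate the edge response to obtain the line spread function.
    lsf = [b - a for a, b in zip(esf, esf[1:])]
    n = len(lsf)
    # Magnitude of the discrete Fourier transform for bins 0 .. n/2.
    magnitudes = []
    for k in range(n // 2 + 1):
        total = sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(lsf))
        magnitudes.append(abs(total))
    # Normalize so that the response at zero spatial frequency is 1.
    return [m / magnitudes[0] for m in magnitudes]

# Synthetic knife-edge scan: an ideal step blurred by a Gaussian of sigma = 2 samples.
sigma = 2.0
esf = [0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0)))) for x in range(-32, 33)]
mtf = mtf_from_edge(esf)

# Bin k corresponds to k / 64 cycles per sample. For a Gaussian blur the exact MTF is
# exp(-2 * (pi * sigma * f) ** 2); the finite difference adds a small extra roll-off.
expected_bin4 = math.exp(-2.0 * (math.pi * sigma * 4 / 64) ** 2)
```

Repeating the scan at several positions in the field gives one such curve per position, which addresses the variation of the MTF across the field of view.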

Periodic calibration is necessary (ideally before and after image digitization) so
that the images archived may be standardized. Furthermore it might be opportune to
use standard color images over standard fields of view in order to check noise on
a picture-element-by-picture-element basis. This, however, is difficult in that
it presents severe alignment problems.

It is hoped that this short description of calibration problems and some of their
solutions will be found useful in establishing a television microscope image data base
for use in picture processing and pattern recognition not only for cell images but
also for use in other types of imaging via the television microscope scanner.




Discussion 

Ken  Preston 
Carnegie  Mellon  University 

Calibration  in  TV  Microscopy 


Question:    What  can  be  done  to  achieve  selective  collection  of  data? 

Answer:       That  depends  on  the  example.    For  SEOS  only  1%  of  the  data  is  analyzed. 

This  is  an  excellent  example  of  the  problem.    However,  no  one  feels  burdened 
by  this  unused  data.    On  the  other  hand,  blood  cell  analysis  is  good:  the 
microscope  looks  at  millions  of  cells  and  outputs  just  five  lines  of  print. 

Comment:     You  need  to  know  a  priori  what  you're  looking  for  to  select  out  desired  data. 

Question:    In  measuring  linearity,  do  you  want  geometric  or  gray  scale  pattern? 

Answer:       Gray  scale.    A  checkerboard  can  do  both. 

Question:    In calibrating these instruments, there is a need for protecting the public.
Should the National Bureau of Standards do this?

Answer:       The Food and Drug Administration has this responsibility under the Medical
Instruments Act.

Question:    Do you need color in the checkerboard test image?

Answer:       No, a gray scale can be used with color filters.

Comment:     A physical object is needed as a test pattern, not just a specification of line
pairs/mm.
DIGITIZATION  OF  ULTRASONIC  SIGNALS  FOR  BIOMEDICAL  INVESTIGATION

J.  F.  Greenleaf,  Ph.D. 
Biophysical  Sciences  Unit,  Mayo  Foundation,  Rochester,  Minnesota  55901 

Transmission images obtained using x-ray energy are analogous to
shooting particles through a mass of tissue and measuring the number of
particles emerging on the other side. Acoustic energy, on the other hand,
can be considered to be a mechanical disturbance of the material itself
propagating through the entire body under investigation. Since the mode of
propagation is the material itself, acoustic energy interacts very intimately
with the material properties of the tissue being imaged. Such
strong interactions can sometimes reveal subtle alterations in tissue
characteristics (1,2).

We have recently developed new techniques of imaging with acoustic
energy which, when applied under highly controlled conditions, can obtain
images representing the two-dimensional distribution of specific acoustic
material properties within tissue. These methods are especially applicable
to imaging of the material properties within breasts. Acquisition, analysis,
image display, and processing of resulting images from such data
require standardization of the entire sequence of processes.

My remarks here will concern only the environment within which the
data are collected and not object descriptions or classification. Acoustic
images obtained from digitized signals consist of values derived from
analysis of, or operations on, digitized signals obtained from acoustic
transducers scanned spatially over known positions on a boundary around
some portion of the object of interest. Elements of the set of activities
required to obtain data necessary for acoustic imaging are: 1) acquisition
of the fundamental acoustic signals, 2) digitization of the signal,
3) measurement of information concerning the transducer characteristics,
motion, and associated lenses, and 4) performance of the analysis techniques
utilized to map the raw data into the final image or images.

Acoustic signals received from transducers represent pressure amplitude
as a function of time. Very often, in addition to the amplitude of the
received signal, one must also know the time at which the transmitter was
pulsed. Transmission systems require, in general, arrival time information
and are usually range gated so that only a portion of the signal time is
utilized for subsequent analysis. On the other hand, echo systems can
require digitization of the acoustic amplitude signal for a longer duration
of time which will include all of the echoes returning from the region of
interest.

The stored data are the digitized version of the raw signal or some
other function of the received amplitude signal. They can represent actual
amplitude digitized at a sample rate high enough to include all of the
expected frequencies in the signal, or they can include rectified, clipped or
thresholded versions of the received signal. The digitized value can also
represent simply the time of arrival of the pulse or the amplitude of the
first pulse received. An example of a system for acquisition of acoustic
signals from tissue within a sample holder is shown in Figure 1.

Figure  1     Computer-based system for acquisition, analysis, and display of
ultrasound signals. Tissue samples, held between lucite plates,
are scanned with a computer-controlled scanner. Computer triggers
transmit pulse and controls parameters of A/D conversion of
received signal (i.e., voltage range, sample rate, number of
samples, etc.). Digitized pulses from each position on 120 x 100
point grid are stored on digital magnetic tape for later display
in B-, C-, or A-scan modes on gray level video display. System
is controlled through remote terminal.

In addition to the digitized version of the signal, one must also
record, or understand by convention, information concerning the transducer
position and characteristics. Acoustic signals to be utilized for obtaining
multi-dimensional images very often require extensive information concerning
the coordinates of the transducers at the time of the transmitted
pulse. Not only the three-dimensional position of the transducer but perhaps
its orientation in space also may be required. Scalar calibration of
the coordinate system can be included externally, that is, as units of
millimeters per pulse, or internally with the use of ticks or internal fiducial
marks within the image. Included in the required information are the
transducer size, frequency response and whatever lenses may be utilized. Under
certain circumstances, calibration of the voltage-to-pressure conversion
parameters of the transducer may also be required.

The  generation  of  quantitative  images  from  digitized  acoustic  signals 
requires  some  form  of  calibration,  either  external  or  internal  to  the  data, 
in  order  to  convert  the  resultant  pixel  value  used  within  the  image  to 
actual  values  of  material  properties  within  the  imaged  tissue.  External 
calibrations, such as the period of delay between the transmitter pulse and
the digitizing window, may be included on the tape header. Calibrations
internal to the image, scanned at the same time as the tissue under study,
may include acoustic data obtained from materials of known acoustic
property which could then be utilized for obtaining conversion tables to
relate the digitized samples to the corresponding tissue properties. An
example of a device for obtaining both spatial calibration and material
property calibration is shown in Figure 2.

Figure  2     Example of tissue holder used for acoustic evaluation of tissue.
Circular chambers can be filled with materials of known
characteristics and scanned along with the tissue (transverse section
through canine heart shown). Subsequent analysis can utilize
signals from known materials for calibration of parameters of
acquisition system. Physical dimensions can be determined by
using edges of lucite block for fiducial marks.
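
The idea of a conversion table built from reference materials scanned with the tissue can be sketched as a piecewise-linear lookup. The readings, property values, units, and function name below are hypothetical and serve only to show the mechanism; the actual relation between a digitized sample and a tissue property need not be linear between reference points.

```python
from bisect import bisect_right

def make_conversion(ref_readings, ref_values):
    """Build a piecewise-linear lookup from digitized readings to a material
    property, using reference materials of known property as anchor points.
    Requires at least two references with distinct readings."""
    pairs = sorted(zip(ref_readings, ref_values))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]

    def convert(reading):
        if not xs[0] <= reading <= xs[-1]:
            raise ValueError("reading lies outside the calibrated range")
        # Index of the upper anchor of the interval containing the reading.
        i = min(bisect_right(xs, reading), len(xs) - 1)
        x0, x1, y0, y1 = xs[i - 1], xs[i], ys[i - 1], ys[i]
        return y0 + (y1 - y0) * (reading - x0) / (x1 - x0)

    return convert

# Hypothetical chambers: digitizer readings paired with known property values
# (arbitrary units chosen for illustration).
to_property = make_conversion([100, 180, 260], [1.40, 1.50, 1.60])
tissue_value = to_property(140)   # halfway between the first two references
```

Refusing to extrapolate outside the calibrated range is a deliberate choice in this sketch, since readings beyond the reference materials have no known property to anchor them.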
Discussion
James  Greenleaf 
Mayo  Clinic 
Images  from  Digitized  Acoustic  Signals 

Question:    What  is  the  quality  of  calibration  in  your  data  library  (for  excised  breasts)? 
Answer:       Very low. We are just getting data for developing pattern recognition techniques.
Will eventually do in live breasts.

Question:    What is your test object for a chromogram?

Answer:       Hard  saline,  blood  samples.    These  give  you  0.1%  on  velocity.  Attenuation 
is  qualitative,  not  quantitative.    Defining  resolution  is  also  hard  because 
of  lens  geometry,  etc.    We  want  to  see  1-2  mm  objects  with  v>v^.^. 
Question:    Do  you  do  any  in  vivo  measurements  yet? 
Answer:  No. 

Question:    How does the reconstruction affect noise?

Answer:       You  have  a  convolution  kernel,  so  you  can  unscramble  the  data.    Jitter  in 
arrival  time  results  in  noise  in  the  velocity  data. 
A  Suggestion  for  the  Calibration  of  Digitized  Imagery

by 

Werner  Frei 
Department  of  Electrical  Engineering 
University  of  Southern  California 
Los  Angeles,  California  90007 


In  recent  years,  advanced  digital 
image  processing  and  pattern  recognition 
techniques  have  attained  a  high  degree  of 
sophistication. At the same time, a growing family of
electro-optical image digitizers has appeared on the market,
supplementing the experimental devices
familiar to the researchers in that field.
Such devices have been naturally adapted
to the particular applications considered,
and their characteristics are just as
varied as the particular designs and
technologies employed.

Thus,  it  is  an  obvious  and  legitimate 
question  to  ask  what  a  given  image  data 
set  precisely  represents,   especially  if 
that  data  set  is  to  become  a  standard  for 
a  certain  research  community.  Ideally, 
one would desire a noise-free measure
of some physical aspect of an object or
scene, as a function of geometric
coordinates. The measure could reflect, for
example, the radiant light intensity, the
reflection coefficient or perhaps the x-ray
absorption factor, depending upon the
application. In addition, the physical
quantity would ideally be band-limited to
the Nyquist rate, i.e., contain no spatial
frequencies above one-half the sampling
frequency. Finally, the sampling pattern
would approximate a two-dimensional array
of Dirac pulses.
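
The band-limiting requirement can be illustrated with a short one-dimensional sketch. It assumes, purely for illustration, a cosine pattern sampled at one sample per unit length; the function names are hypothetical. A pattern at 0.7 cycles per sample, which exceeds half the sampling frequency, produces exactly the same samples as one at 0.3 cycles per sample, so the two cannot be told apart after digitization.

```python
import math

def sample_cosine(freq, sampling_freq, count):
    """Sample cos(2*pi*freq*x) at x = i / sampling_freq for i = 0 .. count-1."""
    return [math.cos(2.0 * math.pi * freq * i / sampling_freq) for i in range(count)]

def alias_frequency(freq, sampling_freq):
    """Apparent frequency, within 0 .. sampling_freq / 2, after sampling."""
    return abs(freq - sampling_freq * round(freq / sampling_freq))

fs = 1.0                               # one sample per unit length (assumed units)
above = sample_cosine(0.7, fs, 16)     # 0.7 cycles/sample exceeds the 0.5 limit
below = sample_cosine(0.3, fs, 16)     # 0.3 cycles/sample is within the limit
max_difference = max(abs(a - b) for a, b in zip(above, below))
apparent = alias_frequency(0.7, fs)    # the frequency the 0.7 pattern masquerades as
```

Here `max_difference` is zero to within rounding error, which is the sense in which an insufficiently band-limited image violates the sampling theorem.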

Clearly,  such  an  ideal  case  is  far 
from  reality,   as  a  number  of  distortions 
and  violations  of  the  sampling  theorem  are 
most likely to occur in the image acquisition
and digitization process. The major
factors obviously depend upon the particular
devices used and can be any combination
of the following physical limitations:

a)  Optical degradations caused by the
front-end imaging system, which
can introduce geometric distortions,
aberrations and vignetting,
as well as limit the spatial
frequency content of the image.

b)  Photographic non-linearities and
film grain noise when a photograph
serves as an intermediate
information storage.

c)  Aliasing, which occurs when the
image being scanned is
insufficiently band-limited.

d)  Other spatial frequency degradations
depending upon non-ideal scanning
apertures.

e)  Inhomogeneous target areas when
the image is converted by
television-type or image dissector
analysers.

f)  Non-linear electro-optical transfer
functions and noise.

g)  Incorrect spectral sensitivities in
the case of color.

h)  Inaccuracies of the A/D converters.

Unfortunately, the above factors are not
easily tractable, even with the help of
manufacturer specifications, perhaps
because they pertain to a variety of fields
and technologies, but also because some
factors depend upon sometimes critical
operator adjustments.

It is felt that a complete specification
of such parameters would require
exacting standards which may not be practical
in view of the widely diverse systems and
domains of application. Not only is it
questionable whether all measurements
can be done on systems in the field, but
it also appears that complete
specifications, if available, could be just as
confusing as certain manufacturers'
descriptions.

The alternative proposed here is that
a series of images scanned on a given
system at one time be accompanied by one
or two test images of well-defined targets.
Such targets can be easily designed to
evidence possible geometric and other
spatial degradations (aliasing, resolution,
for example), in addition to a provision
for grey-scale verification. It is
perfectly conceivable that such test images
can be used not only to check the data
but also to restore the digital imagery by
well-known image processing techniques.

An  excellent  example  of  this  practice 
is  offered  by  television  engineering,  which 
has  been  faced  with  similar  concerns  for 
several decades. It is believed that the
concept could be adopted easily in the present
field, because it provides an objective
frame of reference at a very small
cost  and  without  the  prerequisite  of  a 
very  detailed  knowledge  of  the  physics 
and  technologies  relevant  to  the  hardware 
configurations  involved. 
Discussion


Werner  Frei 
University  of  Southern  California 


A  Suggestion  for  Calibration  of  Digital  Imagery 


Question:    How  about  direct  digitization  of  video  tape  or  radio  or  TV  signals? 

Answer:       You can start at the amplifier. There are electronic test signals for
calibration from this point.

Question:  Would you clarify the requirement for 4 extra bits for 10% accuracy?

Answer:  I can send you the reference.

Comment:  You need at least 2 extra bits.

Question:  Why can't you make the DC knob disappear by setting the zero level?

Answer:  You can. You can avoid all of these problems if you are careful. It is
careless operators that make problems.


For  example,  consider  these  problems  as  histograms: 


[Figure: three histograms of number of recorded points versus intensity, labeled
"Gain too low", "Gain too high (clipped)", and "DC setting".]
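
A check of this kind could be automated along the following lines. This is only a sketch: the three conditions mirror the histogram labels above, but the thresholds, the 256-level assumption, the interpretation of the "DC setting" case as an offset from zero, and the function name are all illustrative assumptions rather than anything specified in the discussion.

```python
from collections import Counter

def diagnose_histogram(pixels, levels=256, clip_fraction=0.01,
                       min_span=0.5, max_offset=0.25):
    """Flag common digitizer setup problems from the distribution of pixel codes.
    All thresholds are arbitrary illustrative defaults."""
    counts = Counter(pixels)
    total = len(pixels)
    lowest, highest = min(counts), max(counts)
    findings = []
    # Many points piled up at the extreme codes suggest the gain is too high.
    if (counts.get(0, 0) + counts.get(levels - 1, 0)) / total > clip_fraction:
        findings.append("clipped")
    # Occupying only a small part of the available codes suggests the gain is too low.
    if (highest - lowest + 1) / levels < min_span:
        findings.append("gain too low")
    # A histogram that starts far above zero suggests a DC (zero level) offset.
    if lowest / levels > max_offset:
        findings.append("DC offset")
    return findings

# Synthetic examples of each situation.
narrow = diagnose_histogram(list(range(20, 60)) * 10)
clipped = diagnose_histogram(list(range(256)) + [255] * 50)
shifted = diagnose_histogram(list(range(100, 250)))
well_set = diagnose_histogram(list(range(256)))
```

Running such a check on the test image that accompanies a set of scans would give the recipient a quick indication of whether the operator adjustments were reasonable.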
Factors  Which  Influence  Acoustic  Images  of  Medical  Objects


J.  K.  Zieniuk 


Institute  of  Fundamental  Technical  Problems 
Polish  Academy  of  Sciences,  Warsaw,  Poland 
Visiting  Scientist  at  Bureau  of  Radiological  Health 
Rockville,  Maryland  20852 


The  ultimate  goal  of  ultrasonic  imaging  is  to  obtain  a  two  dimensional  representation 
of  the  anatomical  region  of  interest  containing  as  much  information  as  possible.    Many  factors 
influence  this  image.    Because  this  is  a  very  broad  and  complicated  topic,  I  will  confine  the 
discussion  mainly  to  the  physical  factors. 

Obviously  any  means  of  storing  or  manipulating  the  ultrasonic  image  should  minimize  the 
loss  of  pertinent  information.    I  shall  try  to  suggest  factors  relevant  to  this. 

It  is  these  factors  that  will  be  important  in  documentation  of  an  imaging  system.  Any 
ultrasonic  visualization  system  uses  a  transducer  which  emits  a  highly  coherent  beam  of 
ultrasonic  energy.    Because  of  this  high  degree  of  coherence  any  discussion  of  such  a  system 
should adopt a Fourier optics approach. One should keep in mind, however, the important
differences, i.e., in the acoustic case, because of the longer wavelengths, we encounter a
different ratio of object dimensions to wavelengths.

In the case of a coherent system using acoustic optics, the limit of resolution is a
function of the F-number of the lens and the wavelength, $\lambda$, of the ultrasound wave. It usually
is defined as $0.61\lambda F$, where $F$ is the ratio of focal length to the lens diameter. This value
depends on the phase difference of the two waves emitted by two point sources which are to
be distinguished. But in any case the value given here is the smallest one. In the case
of a scanning system using a single transducer or an array without any focusing devices the
dimensions of an elementary spot depend mainly on the directional characteristics of the
transducer(s) and the distance between the transducer and the receiver. Because of diffraction
effects, which in the case of the coherent wave play a major role, the dimensions of the
elementary spot will always be bigger than that obtained with the focusing device.

The amount of information contained in the image depends on the spatial frequency
range of the image. The cutoff frequency is equal to $L/(2\lambda d)$ where $L$ is the diameter of the
aperture (square or circle), $\lambda$ is the ultrasonic wavelength and $d$ is the distance between
the aperture and the image plane.
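
The two expressions above can be evaluated numerically as in the following sketch. The sound speed of 1540 m/s (a value commonly quoted for soft tissue), the 3 MHz frequency, the F-number, and the aperture and distance are assumptions chosen for illustration, and the function names are hypothetical.

```python
def wavelength_mm(frequency_hz, sound_speed_m_per_s=1540.0):
    """Ultrasonic wavelength in millimeters for a given frequency."""
    return 1000.0 * sound_speed_m_per_s / frequency_hz

def resolution_limit_mm(wavelength, f_number):
    """Resolution limit 0.61 * wavelength * F, as given in the text."""
    return 0.61 * wavelength * f_number

def cutoff_frequency_per_mm(aperture_mm, wavelength, distance_mm):
    """Spatial cutoff frequency L / (2 * wavelength * d), in cycles per millimeter."""
    return aperture_mm / (2.0 * wavelength * distance_mm)

lam = wavelength_mm(3.0e6)                          # about 0.51 mm at 3 MHz
res = resolution_limit_mm(lam, f_number=2.0)        # about 0.63 mm
cut = cutoff_frequency_per_mm(20.0, lam, 20.0)      # about 0.97 cycles/mm
```

Both quantities scale directly with wavelength, so lowering the frequency coarsens the resolution limit and lowers the cutoff frequency in the same proportion.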

During the storage process the image area has to be divided into elementary areas over
which the amplitude is averaged. The sampling theory gives us information as to how large
an elementary area may be and still preserve all information contained in the image. This
theory states that if $f_{\max}$ is the highest spatial frequency encountered and $L$ is the linear
dimension of the image, the image should be sampled $4(f_{\max} L)^2$ times. This forces the linear
dimension of the elementary area to be no larger than $1/(2 f_{\max})$.

Therefore the optimum storage of an ultrasonic image will occur if we have $4(f_{\max} L)^2$
sample  points.    Since  the  dynamic  range  of  the  amplitudes  in  the  case  of  medical  applications 
can  extend  over  a  100  dB  range,  it  is  important  that  the  mode  of  display,  storage  or  sampling 
device  have  as  large  a  dynamic  range  as  possible  to  avoid  losing  the  information.    In  our 
discussion  we  have  not  yet  taken  into  consideration  the  measurement  time.    For  medical  uses 
it  is  important  to  be  able  to  perform  investigations  under  dynamic  conditions.    Some  existing 
devices  are  capable  of  forming  images  in  1/100  seconds.    Storage  devices  should  be  able  to 
store  the  above  mentioned  information  in  less  than  1/100  seconds. 
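
As a rough worked example of the storage requirement, the sketch below combines the sample count from sampling theory with the number of bits needed to span a given dynamic range and the frame time. It assumes linear amplitude quantization (about 6.02 dB per bit), which the text does not specify; the numerical inputs and function names are likewise illustrative assumptions.

```python
import math

def sample_count(f_max, length):
    """Number of samples 4 * (f_max * L)**2 for a square image of side L."""
    return 4.0 * (f_max * length) ** 2

def max_sample_spacing(f_max):
    """Largest allowed linear dimension of an elementary area, 1 / (2 * f_max)."""
    return 1.0 / (2.0 * f_max)

def bits_for_dynamic_range(range_db):
    """Bits needed to span an amplitude range with linear quantization."""
    # Each additional bit doubles the amplitude range, i.e. adds about 6.02 dB.
    return math.ceil(range_db / (20.0 * math.log10(2.0)))

def storage_rate_bits_per_s(f_max, length, range_db, frame_time_s):
    """Bit rate needed to store one full image within one frame time."""
    return sample_count(f_max, length) * bits_for_dynamic_range(range_db) / frame_time_s

# Illustrative case: f_max = 1 cycle/mm over a 100 mm image, 100 dB, 1/100 s frames.
n_samples = sample_count(1.0, 100.0)
spacing = max_sample_spacing(1.0)
bits = bits_for_dynamic_range(100.0)
rate = storage_rate_bits_per_s(1.0, 100.0, 100.0, 0.01)
```

Under these assumptions the image needs 40,000 samples of 17 bits each, or roughly 68 million bits per second, which gives a sense of why the dynamic range and frame time dominate the demands on the storage device.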

In the preceding discussion we have made the assumption that we wish to store a
previously formed image, i.e., that formed on a cathode ray tube. The quality of this stored
image is influenced by characteristics of the receiver as well as the storage devices. In
some cases the receiver may degrade the image more than the storage device. In this
situation it is desirable to feed the electronic signals to the receiver in the storage system
in parallel mode.

The number and choice of the parameters which should be stored along with the image
depend upon many factors. In routine work with one particular system it would be enough
to  record  the  intensity  of  the  emitted  wave,  pulse  length  and  repetition  rate  as  well  as 
electronic  parameters  which  could  be  changed  during  clinical  investigations.    Also  medical 
parameters  should  be  recorded.    In  the  case  of  research  and  development  work  many  more  system 
characteristics  should  be  recorded.    These  should  include  all  parameters  influencing  spatial 
frequency  band,  and  amplitude  or  intensity  distribution  over  the  image  surface. 
Discussion

Jersy  Zieniuk 
Bureau  of  Radiological  Health 

Factors  which  Influence  Acoustic  Imagery  of  Medical  Objects 

Question:    What kind of a test object is used for ultrasonics?

Answer:       Tried a test with wires. This worked O.K.

Question:    How about velocity calibration?

Answer:       We are not looking at a volume, only at a pellicle, so we only have relative
amplitudes to consider.

Question:    What kind of range, that is, distance from the object to the lens, do you
have?

Answer:       2 cm.

Comment:     Range depends on frequency. 30 cm at 3 MHz and, for reflection, can go up to
10 MHz.
Documentation  of  the  Recording  Environment
by 

M.  Ritter  and  M.  S.  Maxwell 


The  panel's  topic,  "Documentation  of  the  Recording  Environment,"  covers  the 
range  of  sources  that,  in  remote  sensing  from  satellites,  distort  and  degrade  the 
informational  content  of  the  original  scene.    These  sources  are  numerous  and  highly 
variable  in  their  effects,  and  are  distributed  through  all  phases  of  remote  sensing 
from  the  initial  generation  of  scene  radiances  with  problem  areas  such  as  scene 
"noise"  due  to  variable  canopy  cover  and  bireflectance  properties  in  the  case  of 
agricultural  products  to  data  digitization,  detector  linearity  and  noise. 

In  a  rough  way,  we  may  distribute  the  elements  of  satellite  remote  sensing  into 
several  general  areas: 

Physical (Phenomenology)
    Scene generation
    Atmospheric propagation

Acquisition
    Sensor/scanning

Signal Processing
    Prefiltering
    Sampling
    A/D Conversion
    Digitization

Data Storage and Handling
    Communication Links
    Tape Characteristics
    Reformatting
    Interpolation Processes

Information Extraction

I  shall  discuss  briefly  two  areas  that  impact  strongly  on  the  remote  sensing 
chain.    The  first  is  compensation  for  the  sensor  system  response,  which  is  relatively 
easy  to  obtain,  and  to  document.    The  second  is  the  impact  of  the  atmosphere  on 
remote  sensing,  which  is  a  more  intractable  problem. 

Any  sensor  degrades  the  radiance  pattern  (and  hence,  the  information  content)  of 
the  scene  under  observation  in  many  ways.    Diffraction  and  aberrations    of  the  optical 
train  spread  the  energy  from  each  point  of  the  scene  into  areas  in  the  image  plane. 
The  size  of  the  detector  elements  and  the  smear  that  results  from  scanning  the  scene 
across  the  detectors  further  spreads  the  image.    This  situation  is  depicted  in  figure 
1.    The  scene  radiance  pattern,  for  simplicity  is  taken  as  a  vertical  square  wave  bar 
pattern,  (a).    The  discrete  distribution  of  points  in  the  image  plane,  (b),  is  the 
result of the finite number of ray tracings from a single point in the field. Counting
the  number  of  points  per  unit  interval  in  a  given  direction  yields  a  point  spread 
function  for  that  direction,  (c).    The  interaction  of  the  spread  function  with  the 
scene  radiance  readily  (in  this  case  of  a  repetitive  pattern)  exhibits  its  dependence 
on  the  spatial  frequency  of  the  radiance  distribution,  (d),  (e).    A  graph  of  the 
modulation  function  with  spatial  frequency  is  shown  in  (f).    Representative  modulation 
curves  for  the  detector  and  scanning  smear  factor  are  also  shown  on  this  graph.  The 
composite  system  modulation  transfer  function  (MTF)  for  these  three  factors  is  the 
product  of  their  individual  MTF's  and  is  shown  in  (g).    In  practice,  the  system 
modulation  transfer  function  is  a  measured  curve  and  includes  the  composite  effect  of 
all  system  factors.    An  equivalent  measure  is  the  composite  system  point  spread 
function,  (h). 
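
The multiplication of component transfer functions described above can be sketched numerically. The Python fragment below is a minimal illustration, not a model of any particular sensor. The diffraction-limited form used for the optics, the rectangular-aperture (sinc) form used for the detector and the scan smear, and all function names and parameters are assumptions introduced here; the text states only that the composite MTF is the product of the individual MTFs.

```python
import numpy as np

def optics_mtf(f, f_cutoff):
    """Assumed optics model: diffraction-limited MTF of a circular aperture."""
    x = np.clip(np.asarray(f, dtype=float) / f_cutoff, 0.0, 1.0)
    return (2.0 / np.pi) * (np.arccos(x) - x * np.sqrt(1.0 - x ** 2))

def aperture_mtf(f, width):
    """Assumed model for detector size or scan smear: uniform averaging
    over a length `width`, whose MTF is |sinc(f * width)|."""
    # np.sinc(x) is sin(pi x) / (pi x)
    return np.abs(np.sinc(np.asarray(f, dtype=float) * width))

def system_mtf(f, f_cutoff, detector_width, smear_length):
    """Composite MTF: the product of the three component MTFs."""
    return (optics_mtf(f, f_cutoff)
            * aperture_mtf(f, detector_width)
            * aperture_mtf(f, smear_length))

# Hypothetical values: frequencies in cycles/mm, lengths in mm.
frequencies = np.linspace(0.0, 50.0, 101)
composite = system_mtf(frequencies, f_cutoff=50.0,
                       detector_width=0.01, smear_length=0.005)
```

Because every factor is at most one, the composite curve lies on or below each component curve, which is why the measured system MTF is always poorer than that of the optics alone.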

The  system  point  spread  function  is  of  direct  interest  to  us  for  this  discussion. 
It  represents  the  fractional  radiance  "poisoning"  each  pixel  gives  and  receives  from 
its  neighbors.    With  this  detailed  knowledge,  the  sensor  data  output  can  be  reprocessed 
or restored to undo some or much of the radiance poisoning that resulted from the
spread function. The degree of restoration obviously depends on the signal-to-noise
level. This type of work is currently being carried out by Bendix for NASA on
Landsat  imagery  in  an  effort  to  improve  crop  classification  accuracy. 
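
As an illustration of restoration from a known spread function, the following Python sketch applies a Wiener-style regularized inverse filter to a one-dimensional scan line. This particular filter is an assumption chosen for the example; the text does not say which restoration method was used on the Landsat data. The function names, the three-point spread function, and the `snr` parameter (an assumed signal-to-noise power ratio) are likewise hypothetical.

```python
import numpy as np

def transfer_function(psf, n):
    """Transfer function of a centred point spread function on an
    n-sample periodic scan line."""
    kernel = np.zeros(n)
    kernel[:psf.size] = psf / psf.sum()         # normalise to unit area
    kernel = np.roll(kernel, -(psf.size // 2))  # put the PSF centre at sample 0
    return np.fft.fft(kernel)

def blur(scene, psf):
    """Simulate the sensor: circular convolution of the scene with the PSF."""
    H = transfer_function(psf, scene.size)
    return np.real(np.fft.ifft(H * np.fft.fft(scene)))

def restore(observed, psf, snr):
    """Wiener-style inverse filter. A large snr approaches a pure inverse
    filter; a small snr suppresses frequencies the sensor passed weakly."""
    H = transfer_function(psf, observed.size)
    G = np.conj(H) / (np.abs(H) ** 2 + 1.0 / snr)
    return np.real(np.fft.ifft(G * np.fft.fft(observed)))

# A square-wave bar pattern, blurred by a three-point spread function.
scene = np.tile(np.r_[np.ones(4), np.zeros(4)], 8)
psf = np.array([0.25, 0.5, 0.25])
observed = blur(scene, psf)
restored = restore(observed, psf, snr=1e4)
```

The `1.0 / snr` term is what makes the degree of restoration depend on the signal-to-noise level: as noise grows, the filter backs away from full inversion at frequencies where the transfer function is small.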

So  much  for  the  easy  case.    Let  us  turn  to  the  impact  of  the  atmosphere  on  the 
propagation  of  the  scene  radiance  pattern.    We  had  tacitly  implied  that  there  was  no 
atmosphere  in  the  first  case,  that  is,  that  the  scene  radiance  reached  the  sensor 
collector  without  alteration  of  spatial  and  spectral  radiometric  distributions.  In 
actuality,  the  situation  is  complex,  highly  variable,  and  under  certain  conditions  of 
significant  magnitude  in  its  effect  on  the  information  content  of  the  scene.  Elements 
of atmospheric scattering are shown in figure 2. Absorptive processes, clouds, and
cloud  shadowing  are  not  addressed.    The  specific  intensity  of  the  light  at  the  sensor 
depends  not  only  on  the  reflectivity  of  the  object  under  view  in  the  scene  but  also 
the  solar  illumination  at  the  top  of  the  sensible  atmosphere,  the  scatter  attenuation 
of  this  illumination  as  a  function  of  the  incident  and  emergent  optical  path  lengths 
in  the  atmosphere,  the  path  radiance  due  to  atmospheric  scatter,  the  albedo  of  the 
background  and  its  scatter  into  the  sensor's  field  of  view.    Further,  the  scatter 
function  depends  upon  the  aerosol  content  of  the  atmosphere. 

To  a  first  approximation,  ignoring  inhomogeneity  and  anisotropy,  two  known 
landmarks  or  training  sites  in  a  total  scene  would  be  sufficient  to  rectify  the 
alterations  due  to  the  atmosphere,  viz. 

The irradiance at the sensor may be expressed as:

[equation not legible in the source]

where the terms denote:

the impingent solar illumination,
the average reflectivity in the band,
the effective optical path length for the band,
θ, the sensor zenith angle, and
the path radiance.

It thus follows that:

[equation not legible in the source]

The irradiances at the sensor are measured quantities; hence, if the reflectivities of the
landmarks are somehow known, then the ratio [illegible]/H_0 is determinable.

The unknown can thus be inferred by:

[equation not legible in the source]

As  of  now,  no  atmospheric  corrections  are  being  applied  in  an  operational  mode 
to  satellite  imagery.    Research  in  this  area  is  being  conducted  presently  at  the 
NASA/Johnson  Space  Center. 
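
The two-landmark idea described above can be sketched under a deliberately simple model. Assume, for one spectral band, that the measured radiance L is linear in the ground reflectivity, L = gain * reflectivity + path, where `gain` lumps together solar illumination and atmospheric attenuation and `path` is the path radiance. This linear form, the function names, and the numbers below are assumptions made for illustration; they are not a reconstruction of the expressions in the original paper.

```python
def fit_atmosphere(rho_1, radiance_1, rho_2, radiance_2):
    """Solve for gain and path radiance from two landmarks whose
    reflectivities (rho_1, rho_2) are known and whose radiances were measured."""
    if rho_1 == rho_2:
        raise ValueError("the two landmarks must differ in reflectivity")
    gain = (radiance_1 - radiance_2) / (rho_1 - rho_2)
    path = radiance_1 - gain * rho_1
    return gain, path

def reflectivity(radiance, gain, path):
    """Infer the reflectivity of an unknown target from its measured radiance."""
    return (radiance - path) / gain

# Hypothetical landmarks: a dark site and a bright site in the same scene.
gain, path = fit_atmosphere(0.1, 10.0, 0.5, 30.0)
unknown = reflectivity(20.0, gain, path)
```

Two landmarks suffice because the assumed model has exactly two unknowns per band; inhomogeneity or anisotropy of the atmosphere across the scene would break that assumption.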

Discussion

Milton Ritter
NASA Goddard Space Flight Center

Sensor and Atmospheric Effects on Satellite Imagery

Question:    In inverse filtering, you have a point spread f(T) which isn't constant.

Answer:      You need at least nominal degradation and can do better. T is a second order
effect.

Question:    Do you know the spread function?

Answer:      You know the design spread function.

Comment:     This has been done for space images and second order corrections really are
second order.

Comment:     You can do second order corrections "on the fly".

Question:    How about S/N?

Answer:      You can improve by correlation with neighboring points.

Correcting Gray Scale Distortions in Photographic Images

by 

Brent  S.  Baxter 
Computer  Science  Department 

University  of  Utah 
Salt  Lake  City,  Utah  84112 

The  objective  in  digitizing  an  image  is  to  obtain  a  numeric  representation  having 
some  known  relation  to  the  spatial  distribution  of  light  energy  incident  on  some  optical 
surface  such  as  a  camera  film  plane  or  the  retina  of  the  eye.    One  common  relationship 
(and  indeed  a  very  useful  one)  is  simply  to  require  that  the  numeric  data  be  proportional 
to  light  intensity.    Simple  as  this  seems,  the  introduction  of  a  photographic  process 
between  the  original  scene  and  final  numeric  data  requires  a  carefully  applied 
calibration and correction step to correct for gray scale nonlinearities introduced
by  the  film. 

To  measure  and  correct  for  these  distortions,  a  sample  of  film,  identical  to  the 
one used in the camera, is exposed to carefully measured amounts of light in a sensitometer
or other suitable instrument. Figure 1 illustrates this situation schematically.
Both  films  are  processed  together  in  the  same  chemistry  and  the  density  of  the  test 
film  is  measured  as  shown  in  Figure  2.    Note  that  the  photographic  density*  is  an 
approximately  logarithmic  function  of  exposure.    Note  also  that  substantial  modifications 
to  this  function  are  possible  by  changing  the  development  process.    This  is  the  reason 
both  films  should  be  processed  simultaneously. 
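
The correction step implied by this calibration can be sketched as a table lookup: the density measured on the test film at each known exposure defines a curve, and that curve is inverted to map densities read from the image film back to relative exposure. The Python sketch below assumes the curve is monotonic over the range used and inverts it by linear interpolation; the function name and the calibration values are hypothetical.

```python
import numpy as np

def exposure_from_density(density, cal_log_exposure, cal_density):
    """Map film density to relative exposure by inverting the measured
    calibration curve (density versus log10 exposure).
    Assumes the calibration densities increase monotonically with exposure."""
    cal_density = np.asarray(cal_density, dtype=float)
    cal_log_exposure = np.asarray(cal_log_exposure, dtype=float)
    order = np.argsort(cal_density)          # np.interp needs increasing x values
    log_e = np.interp(density, cal_density[order], cal_log_exposure[order])
    return 10.0 ** log_e

# Hypothetical sensitometer steps and the densities measured on the test film.
steps_log_exposure = [0.0, 0.5, 1.0, 1.5, 2.0]
steps_density = [0.1, 0.3, 0.8, 1.3, 1.5]

image_density = np.array([0.8, 0.55])
relative_exposure = exposure_from_density(
    image_density, steps_log_exposure, steps_density)
```

Densities outside the calibrated range are clamped to the end points by `np.interp`, so regions of the image that fall on the flat toe or shoulder of the curve cannot be recovered reliably.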

Discussion

Brent  Baxter 
University  of  Utah 

Correcting  Gray  Scale  Distortions  in  Digitizing  Photographs 

Question:    There is a problem of getting an invertible F.

Answer:      If the image has very dark or very light regions, you want a low contrast film
to be able to encompass the entire range. You can increase the range by cutting
down development time.

Question:    What about resolution at high silver regions?

Answer:      You must be able to measure I_0 accurately.

Comment:     Put a wedge on known substance. (This gives a known density on the film to
automatically calibrate the densitometer during readout.)




Discussion 

Lt.  Boccia 
Armed  Forces  Institute  for  Pathology 

The  Recording  Environment  in  Pathology 


Question:    Would  you  comment  on  the  data  bank  of  images? 

Answer:      This is proposed. We are just beginning. We have the instrument: the scanner,
a 1/2 µ stepping stage, and a small minicomputer (16 bit, 32K core, 20M bytes disc).
It takes 2 minutes to scan a cell, but this is very flexible. The system looks
for a nucleus, can edit data, and can get a skeleton overlay.

Question:    How  do  you  do  calibration? 

Answer:       We  will  need  test  patterns.    Dr.  Bahr  and  electronics  staff  will  do  it. 
Question:    Would  you  be  willing  to  reblock  your  data  to  another  standard  format? 
Answer:  Yes. 

PROTOTYPE IMAGES


Chairpersons:   Theo.  Pavlidis,  Princeton  University 

Roger N. Nagel, National Institutes of Health


Panelists:    Jack  Sklansky,  University  of  California,  Irvine 
Sam  Dwyer,  University  of  Missouri 
Fred  Billingsley,  Jet  Propulsion  Lab 

Panel on Prototype Images: Overview
by 

Roger  Nagel 

The panel on prototype images was formed to consider the goals and selection
criteria for a centralized data base of prototype images. The panel addressed the topic
from several points of view including the philosophy, practicality, and utility of
such an endeavor.

In the first three of the panelists' presentations, the design goals, practical
problems, and technical problems attendant on the formulation of an image data base
were discussed. These talks, as summarized by the panelists themselves, are presented in
the text. The remaining two panelists described their experiences in the actual
preparation and use of data bases designed for specific application areas. These
presentations are also included in this report.

After the individual talks, at the conclusion of the presentations, audience
participation was solicited. While no consensus can be claimed, there was agreement on
several issues, as summarized below.

On the question of whether a data base of limited size should be solicited and distributed
by a central facility, the general sentiment was yes, and the suggestion was made that
NTIS or NBS spearhead the effort.

Two competing goals for such a data base seemed to divide the audience. Some see
the goal as facilitating the comparison of algorithmic techniques, e.g., texture measures,
while others feel that specific application areas such as radiography should be the main
selection goal. It was apparent to all that no single data base would suffice, and
that the creation of several data bases by parties with similar goals should be encouraged.
It was pointed out that several such data bases currently exist. Equally obvious is the
desire that all such data bases be in some "universal" format, hopefully agreed on or
specified by this conference.

No matter what the goal of a candidate data base, there seemed to be agreement that
selection criteria and screening of data base entries were important. It was suggested
that in formulating a candidate data base a committee be appointed to screen and solicit
images. Criteria should include, but not be limited to, images with varying degrees of
difficulty, prior results on published algorithms, reasonable sample sizes, and precise
documentation on the source and recording environment of the imagery.

It  was  further  pointed  out  that  the  availability  of  prototype  imagery  is  often 
complicated  by  legal  problems  as  in  the  case  of  medical  data,  and  classification 
problems  as  in  the  case  of  intelligence  data.    In  addition  it  is  frequently  difficult 
to document the "ground truth" of the image as human photointerpreters, experts in the
various  fields,  often  disagree. 

Finally, differing image sizes, bits per pixel, and pixel data types
represent an additional source of problems in a centralized data base. Images range from
as few as 256 pixels to millions of pixels, while individual pixels can be
binary or complex valued.

In  summary,  it  was  felt  that  the  problems  to  be  faced  in  a  centralized  image  data 
base  are  formidable,  but  that  practical  data  bases  related  to  specific  goals  could  be 
solicited,  selected,  and  distributed  if  well  defined  goals  and  selection  criteria 
are  employed. 

PROTOTYPE IMAGES - GENERAL REMARKS


by 

Theodosios  Pavlidis 
Dept.  of  Electrical  Engineering  &  Computer  Science 
Princeton  University 
Princeton,   N.J.  08540 


The major motivation for the creation of a data base of Prototype Images is that it will
allow the direct comparison and evaluation of various methods in image processing, pattern
recognition, automated diagnosis, etc. Although pictorial data bases should meet certain
standards of documentation and portability, Prototype Images must satisfy a number of
additional constraints. The purpose of this session is to make suggestions and
recommendations in this context. I believe that a number of considerations must be taken
into account.

I.  Purpose: There are two major types of work where such a data base will be used, each
type having different requirements:

(1) Testing of Picture Processing Algorithms (Segmentation, Boundary Tracing, etc.).

(2) Testing of Classification Algorithms.

In the second case one will deal with specific applications (e.g., tumor detection in chest
radiographs). In the former there is no such restriction and a mixture of subjects is
desirable in order to test the generality of a proposed algorithm. In order to offer
statistically valid results the data bases used for the second purpose must be quite large.

II.  Major Features: (1) It is important that the test images have an appropriate degree of
difficulty in order to make comparisons meaningful. (2) The images should have no bias in
favor of any given methodology. For example, some edge detectors do quite well along
vertical or horizontal directions but not along others. Thus a proper test picture should
not show directional preference. (3) Precise documentation about the recording environment,
tape format, and description of the contents must be provided.

The first two features are of special relevance for the selection of pictures for the
testing of processing algorithms. This fact suggests that the use of surrogate images
should be given serious consideration. The third feature is of particular relevance for the
testing of classification algorithms and a number of very challenging problems are faced
there.
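
One way to picture a surrogate image without directional preference is a synthetic disk, whose boundary presents edges at every orientation in equal measure. The Python sketch below builds such an image and averages an edge detector's response over angular sectors, so that a directional bias shows up as unequal sector means. The image size, gray levels, function names, and the horizontal-only gradient used as a deliberately biased detector are all assumptions made for illustration.

```python
import numpy as np

def disk_image(size=128, radius=40.0, inside=200, outside=50):
    """Surrogate test image: a bright disk centred on a darker background."""
    c = (size - 1) / 2.0
    y, x = np.mgrid[0:size, 0:size]
    r2 = (x - c) ** 2 + (y - c) ** 2
    return np.where(r2 <= radius ** 2, inside, outside).astype(np.uint8)

def sector_means(edge_map, n_sectors=8):
    """Mean edge response in each angular sector about the image centre.
    Assumes a square edge map."""
    size = edge_map.shape[0]
    c = (size - 1) / 2.0
    y, x = np.mgrid[0:size, 0:size]
    angle = np.arctan2(y - c, x - c)                      # -pi .. pi
    sector = np.floor((angle + np.pi) / (2 * np.pi) * n_sectors).astype(int)
    sector %= n_sectors                                   # fold angle == pi into sector 0
    return np.array([edge_map[sector == k].mean() for k in range(n_sectors)])

img = disk_image()
# A deliberately biased "detector": it responds to horizontal changes only.
horizontal_only = np.abs(np.gradient(img.astype(float), axis=1))
response = sector_means(horizontal_only)
```

A detector free of directional preference would give roughly equal values in `response`; the horizontal-only gradient does not, which is the kind of bias a well-chosen test picture is meant to expose.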

The size of the data base for picture processing tests will probably be rather small.
However the opposite is true for the other type. Not only must each set of prototype images
be quite large but a large number of them will be necessary in order to take care of the
ever increasing applications where the methodology of pattern recognition is applied.

A preliminary effort in establishing guidelines for data bases of prototype images has been
made by a task force (chaired by this speaker) within the framework of the Biomedical
Pattern Recognition Subcommittee (chaired by Professor J. Sklansky) of the IEEE Computer
Society Machine Intelligence and Pattern Analysis Technical Committee (chaired by Professor
K. S. Fu). The task force concluded that it would be very difficult to establish precise
guidelines for acceptability of prototype images. Instead it suggested that a review of a
proposed data base by three independent referees be performed. The referees would evaluate
whether it is appropriate to serve as a set of prototype images. Thus standards of quality
would be maintained in a similar manner as for published papers.

However a number of questions remain open: the choice of one or more facilities where data
bases would be submitted or which would solicit data bases of prototype images; the
selection of referees; the financing of the reviewing procedure, which can be quite
expensive; and the distribution mechanism. It should be noted that these questions are also
in the domain of the IEEE-MIPA Subcommittee on Data Bases (chaired by Dr. J. B. McFerran).

Prototype Images - Selection Problems

by 

Roger Nagel

The formulation of an image data base raises several important content and procedural
questions. At this juncture it is not at all clear how the procedural questions of
format and documentation can be settled. However, given that such questions can be
reasonably answered (by other groups at the meeting), we are left with the problems of data
base collection and distribution.

Implicit in a data base is some notion of content. Yet here again a question of
scope arises. For example, to what audience will the data base be addressed? If it is
assumed that the data base is intended for a comparison of algorithms in the
technique-oriented sense then the ability to cover many application areas is compromised,
and vice versa. This question is essential because of the expected limit to the size of any
data base in order to make distribution feasible.

In Table 1 a representative list of application areas is presented and in Table 2
a sample listing of techniques is presented. Both the size and incompleteness of these
tables bring home the point that an exhaustive inclusion of sample imagery is not
practical. Thus the selection of prototype images presents a difficult problem simply
in terms of content criteria.

Due to the limited size of a realistic data base, and the obvious fact that it is
impractical to cover all possible areas, the task for which the data base is intended
becomes a critical factor in the selection of prototype images. If we direct the
data base to the research and development community, it implies a technique orientation,
whereas if it is directed at industrial users, application areas are the guiding
criteria for content selection. My own bias would be toward research and development,
and I would choose those techniques which are currently in the literature as the criteria
for sample image selection.

A  reasonable  rationale  for  such  a  choice  is  the  flow  of  techniques  from  the  R&D 
community to industry. Thus we provide the test bed for algorithm development and
comparison, while the general utility in particular application areas is not yet tested.

Even when the content selection criteria are known, we are still left with the
question  of  mechanism  for  collection,  and  distribution.    That  is,  how  much  screening 
of  candidate  images  should  be  done  for  documentation,  format,  prior  results,  and  utility? 
Can  or  should  such  a  data  base  be  updated  to  grow  in  number  of  samples?  Furthermore, 
should  results  of  new  algorithms  be  added  to  the  documentation,  as  the  data  base  is  used? 

In  closing  I  put  forward  the  proposal  that  a  limited  data  base  can  be  selected  and 
distributed.    In  the  beginning,  a  modest  set  of  techniques  should  be  selected  as  the 
basis  for  content  selection.    This  proposal  assumes  that  some  form  of  universal  format 
will  be  developed  by  the  other  workshops,  as  well  as  a  minimum  documentation  standard. 
The distribution of such a data base with no updating facility is now possible at NTIS
and other sources.

While  such  a  proposal  is  only  a  first  step,  it  is  within  the  realm  of  possibility 
and  can  be  spearheaded  by  the  National  Bureau  of  Standards. 

TABLE 1:     APPLICATION AREAS

SATELLITE
    RESOURCE MONITORING
    WEATHER FORECASTING
    MILITARY - TARGET RECOGNITION
    INTELLIGENCE RECONNAISSANCE

MEDICAL
    RADIOGRAPH
    RECONSTRUCTION
    CYTOLOGY

FORENSIC
    FINGERPRINT
    HANDWRITING
    FACE IDENTIFICATION

INDUSTRIAL
    NON DESTRUCTIVE TESTING
    CHARACTER & PRINT READING
    ASSEMBLY LINE MONITORING

ROBOTICS

CARTOGRAPHY


TABLE 2:     TECHNIQUE AREAS

POINT OPERATIONS
NEIGHBORHOOD OPERATIONS
TEXTURE
EDGE & LINE
TRANSFORMATIONS
    REGISTRATION & SIMPLE COMBINATION
    GEOMETRICAL
SEGMENTATION, TRACKING
QUANTITATIVE MEASUREMENTS
CLASSIFICATION

Prototype Radiographs


by 

Jack  Sklansky 
School  of  Engineering 
University  of  California 
Irvine,  California  92717 

I.  INTRODUCTION 

A  technology  of  computer-aided  image  analysis  currently  is  being  developed  for  two 
major areas: a) remotely sensed terrestrial images, and b) biomedical images. Within
biomedical  images,  radiographs  are  receiving  special  attention.    This  can  perhaps  be 
explained  by  the  following  observations: 

1)  In  the  United  States,  approximately  700,000,000  medical  radiographs  are 
analyzed  annually  by  a  population  of  about  9000  board-certified  radiologists. 

2)  Of  these  radiographs,  about  230,000,000  are  chest  radiographs. 

3)  Perhaps  a  thirty  percent  increase  in  the  number  of  radiographs  will  take  place 
in  response  to  the  concern  over  the  increased  costs  imposed  by  malpractice 
suits. 

4)  There  is  a  trend  toward  more  precise  and  careful  documentation  of  diagnoses 
of  radiographs  also  in  response  to  the  threat  of  malpractice  suits. 

5)  The incidence of false negatives in the detection of benign tumors is about
thirty percent. Comparable error rates for other lesions seem likely.

6)  Among the various forms of images that reveal a patient's internal structures
(e.g., X-radiographs (including xerograms), thermograms, ultrasonic scans,
and nuclear images), X-radiographs provide the greatest resolution and
consequently have the greatest information content per image.

The  following  are  a  few  of  the  potential  applications  for  a  technology  of  computer- 
aided  analysis  of  radiographs: 

a)  routine  diagnosis  of  radiographs,  including  report-writing, 

b)  therapy  (e.g.,  radiation  therapy;  surgery), 

c)  mass screening,

d)  public health (e.g., correlating radiographically determined structures with
certain populations),

e)  basic medical science (e.g., relating computer-detected pictorial textures
to diseases),

f)  training  of  radiologists. 

The  development  of  an  image-analysis  technology  needs  standard  or  prototype 
images.    In  this  paper  we  discuss  several  of  the  basic  issues  involved  in  the  construction 
of  a  data  base  of  prototype  radiographs  as  a  means  to  accelerate  the  development  of  an 
effective  technology  of  computer-aided  radiography. 

II.    WHY DO WE NEED PROTOTYPE RADIOGRAPHS?

The  following  are  a  few  of  the  benefits  that  would  be  provided  by  a  readily 
available  set  of  prototype  radiographs. 

1)  An  objective  comparison  of  computer  algorithms  developed  by  different  research 
groups  would  be  greatly  facilitated  by  a  set  of  prototype  radiographs. 

2)  The acceptance by the medical profession of a computer algorithm for aiding
the diagnosis of radiographs usually implies a change in the routines by which
diagnoses are reached. If a convincing case for such a change can be made
at all, it will have to be based in part on a test of the algorithm on a properly
selected set of prototype radiographs whose diagnoses are accepted by
the medical profession as valid.

3)  Occasionally  a  computer  algorithm  may  reveal  a  relation  between  computed 
pictorial features and otherwise indiscernible physiological or diagnostic
phenomena  in  the  radiograph.    To  establish  the  validity  and/or  reliability  of 
such  a  relation  will  require  a  set  of  prototype  images  illustrating  the 
phenomenon  of  interest. 

4)  For many research groups, a representative set of prototypes will eliminate or
postpone  the  purchase,  operation,  and  maintenance  of  expensive  scanning  and 
digitizing  equipment. 

One  cannot  hope  to  build  a  set  of  prototype  radiographs  in  anticipation  of  all 
algorithms  that  will  need  to  be  tested.    The  domain  of  possible  disease  states  and 
possible  physiological  phenomena  is  too  large.    Hence  we  must  a)  construct  a  small 
representative  set  of  radiographs,  and  b)  establish  a  set  of  guidelines  and  criteria 
for  the  selection  and  solicitation  of  prototype  radiographs. 

III.  EVALUATION CRITERIA

Below  we  suggest  a)  the  forms  that  prototype  radiographs  may  take,  and  b)  possible 
criteria for determining the acceptability of a candidate radiograph for the file of
radiographs. 

We  suggest  that  a  prototype  radiograph  may  take  one  or  any  combination  of  the 
following  forms. 

a)  A "raw" radiograph -- i.e., the most direct, highest quality means of storing
the image. X-radiographs are usually recorded on film or xerographic paper.
In cases where an image intensifier tube is coupled to a television display,
the raw radiograph would be a video tape recording. In the cases of computerized
tomography and isotope scans, the most accurate storage technique is a digital
memory -- usually digital magnetic tape.

b)  An  unfiltered  digitized  radiograph.    This  is  usually  obtained  from  the  raw 
radiograph  by  means  of  a  scanning  or  reading  device  and  an  analog-to-digital 
converter,  and  stored  on  digital  magnetic  tape.    Computerized  tomography  and 
isotope  scans  are  conveniently  stored  directly  in  this  form  of  memory. 

c)  A  filtered  digital  radiograph.    This  may  be  obtained  by  compressing  the 
spatial  and  gray  level  digitizations  of  the  unfiltered  radiograph  into  a  form 
(e.g.,  eight  bits  per  pixel,  256  x  256  pixels  per  radiograph)  that  can  be 
accepted  by  most  minicomputer  systems. 
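
The compression described in form c) can be sketched as follows. The Python example reduces a finely digitized radiograph to 256 x 256 pixels at eight bits per pixel by averaging square blocks and linearly rescaling the gray levels. Block averaging and linear rescaling are assumptions chosen for simplicity; the text does not prescribe a particular filter, and the function name and the 12-bit input depth are hypothetical.

```python
import numpy as np

def compress_radiograph(raw, out_size=256, in_bits=12):
    """Reduce an unfiltered digitized radiograph to out_size x out_size
    pixels at eight bits per pixel."""
    rows, cols = raw.shape
    block_r, block_c = rows // out_size, cols // out_size
    if block_r < 1 or block_c < 1:
        raise ValueError("input is smaller than the requested output size")
    # Trim any remainder so the image divides evenly into blocks.
    trimmed = raw[:block_r * out_size, :block_c * out_size].astype(np.float64)
    # Spatial compression: average each block_r x block_c block.
    blocks = trimmed.reshape(out_size, block_r, out_size, block_c)
    averaged = blocks.mean(axis=(1, 3))
    # Gray level compression: map 0 .. 2**in_bits - 1 onto 0 .. 255.
    scaled = averaged * 255.0 / (2 ** in_bits - 1)
    return np.round(scaled).astype(np.uint8)

# Hypothetical 1024 x 1024 scan at 12 bits per pixel.
scan = np.full((1024, 1024), 4095, dtype=np.uint16)
small = compress_radiograph(scan)
```

Both steps discard information, which is why the filtered form calls for the most documentation: the block size and the gray level mapping must be recorded along with the image.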

IV.  DOCUMENTATION  OF  PROTOTYPE  RADIOGRAPHS 

Regardless  of  which  of  these  forms  is  used,  each  prototype  radiograph  will  need 
to  be  accompanied  by  substantial  documentation:  the  raw  radiograph  requiring  the  least 
documentation,  and  the  digitized  filtered  radiograph  the  most. 

We suggest that ideally the documentation include the following:

1)  Imaging system

Anode voltage
Subject-to-film distance
Anode-to-film distance
Angle of the anode
Anode current
Exposure time
Frequency spectrum of emitted X-rays
Dimensions of the Potter-Bucky diaphragm, if used
Types and thicknesses of film, emulsion, and fluorescent screen
Point spread function as a function of position in the image plane
Film processing parameters: especially development time, chemical ingredients,
and temperature.

2)  Patient 

Weight, height, sex, age, occupation, medical history, genetic factors,
geographic  habitat,  disease  (if  known).    If  a  panel  of  experts  established 
the  disease,  the  members  of  the  panel  should  be  identified  and  the  distribution 
of  votes  for  each  polled  opinion  should  be  described,  without  necessarily 
revealing  how  each  member  voted. 

If  surgical  confirmation  of  the  diagnosis  is  available,  the  basis  for 
this  confirmation  (i.e.,  the  observation  or  the  pathological  test)  should 
be  described. 

Portion  of  patient  viewed  (e.g.,  chest,  lumbar  region,  breast) 
Projection  (e.g.,  posterior-anterior,  lateral) 

Geometric  parameters  (film-to-object  distance,  size  of  object,  etc.) 

3)  Phantom  of  human  structures. 

A  precise  identification  or  description  of  the  phantom  should  be  given.  A 
full  description  of  the  procedure  for  replicating  the  phantom  should  be 
available. 

4)  Test  pattern. 

Two  types  of  test  patterns  for  prototype  radiographs  are  desirable: 
a  step  wedge,  held  at  a  known  distance  from  the  film,  superimposed  on  the  raw 
radiograph;  and  a  three  dimensional  test  object,  exposed  onto  a  separate  film 
without  the  patient,  with  exposure  and  other  imaging  conditions  identical  to 
that  for  the  raw  radiograph. 

5) Additional documentation for an unfiltered digitized radiograph.

If the digitization was obtained from a scanning device, the basic physical
parameters of the scanner must be documented in addition to the documentation
required for raw radiographs. These parameters include the aperture, the
signal-to-noise level, the range of linearity of the scanner pixels, and the number
of bits per pixel. If the digitization is obtained from computerized tomography,
the basic physical parameters of the tomographic system (such as the diameter
of the ray, the distance between adjacent ray positions, and the angle
between adjacent projections) as well as the reconstruction algorithm must
be specified.

6)  Additional  documentation  for  a  filtered  digitized  radiograph. 

In this form, the parameters of the digital filter must be documented, in
addition to the documentation needed for the raw radiograph and unfiltered
digitized  radiograph.    For  example,  if  the  filter  operates  on  a  histogram,  the 
histogram  transformation  must  be  described.    The  number  of  bits  per  pixel 
and  the  number  of  pixels  per  unit  length  must  be  specified  for  both  the 
unfiltered  and  the  filtered  digitized  radiographs. 

In addition, of course, the recording parameters must be specified. For
details on this subject see the report of the session on "Standard Tape Formats."
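
The documentation requirements above nest: each later form of the radiograph needs everything the earlier forms need, plus more. A minimal sketch of a record that checks this nesting is given here in Python. The field names, the units, and the function `missing_documentation` are hypothetical and chosen only for illustration; the list above names the items but prescribes no encoding, and only a few of the listed items are included.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical field names and units; only a few of the listed items are shown.
RAW_FIELDS = ("anode_voltage_kv", "anode_current_ma", "exposure_time_s")
DIGITIZED_FIELDS = ("scanner_aperture_um", "bits_per_pixel", "pixels_per_mm")
FILTERED_FIELDS = ("filter_description",)

# Each form requires everything the previous form requires, plus more.
REQUIRED = {
    "raw": RAW_FIELDS,
    "unfiltered_digitized": RAW_FIELDS + DIGITIZED_FIELDS,
    "filtered_digitized": RAW_FIELDS + DIGITIZED_FIELDS + FILTERED_FIELDS,
}


@dataclass
class RadiographRecord:
    form: str  # one of the keys of REQUIRED
    anode_voltage_kv: Optional[float] = None
    anode_current_ma: Optional[float] = None
    exposure_time_s: Optional[float] = None
    scanner_aperture_um: Optional[float] = None
    bits_per_pixel: Optional[int] = None
    pixels_per_mm: Optional[float] = None
    filter_description: Optional[str] = None


def missing_documentation(record: RadiographRecord) -> List[str]:
    """Return the required fields that have not been filled in."""
    return [name for name in REQUIRED[record.form]
            if getattr(record, name) is None]
```

A record that is complete as a raw radiograph reports the scanner and filter items as missing once it is relabelled as a filtered digitized radiograph.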


IV.    SOLICITATION  FOR  A  CENTRAL  FILE 

The prototypes to be solicited depend on the types of algorithms under development.
Currently  we  are  aware  of  research  on  computerized  analysis  of  radiographs  of  the  following 
structures. 

1) Chest

a)  ribs 

b)  lung  tissue  (especially  pneumoconiosis) 

c)  lung  tumors 

d)  calcifications 

e)  heart 

f)  pulmonary  blood  vessels 

2) Breast

a) cysts

b) ducts

c)  chest  walls 

d)  skin  profile 

e)  carcinomas 

3)  Bone 

a)  trabeculae 

b)  slight  fractures 

c)  spine 

d)  knee 

4) Stomach

5) Head

In addition to prototype radiographs of human subjects (e.g., chest radiographs,
xeromammograms, etc.), it will also be useful to have access to prototype radiographs of
excised tissue. Examples of such tissue are cancerous breast tumors, benign breast tumors,
diseased bone, and diseased lung tissue. Radiographs of these excised tissues will
provide data that is not obscured by images of tissue outside the concern of the
algorithm developer.

In  addition  to  prototype  radiographs  of  human  subjects  and  of  excised  tissue, 
it  will  be  useful  to  have  access  to  prototype  radiographs  of  well  designed  phantoms. 
These  phantoms  fall  in  two  classes:  test  patterns  and  realistic  simulations  of  human 
physiology. 

The  test  patterns  are  needed  in  order  to  provide  a  precise  means  of  modelling  the 
X-ray imaging system and the effects of the film and Bucky grid. Since the imaging
process  depends  on  the  distribution  of  the  density  of  the  subject  in  3-space,  as  well 
as  on  the  distribution  of  the  focal  spot,  it  is  likely  that  a  wide  variety  of  three- 
dimensional  test  patterns  will  be  designed  and  constructed.    Evaluating  the  ability 
of  various  algorithms  to  model  or  reconstruct  these  test  patterns  may  be  a  useful  early 
step  in  the  development  of  these  algorithms. 

A good example of a phantom that simulates human physiology has been constructed
at the University of California at Irvine under the supervision of Dr. E. N. C. Milne and
with the technical assistance of W. Roecke.

The  phantom  simulates  the  chest:  including  the  ribs,  the  lung,  the  heart, 
pulmonary  vessels,  lung  tumors,  and  calcifications.    This  phantom  yields  exceedingly 
accurate  simulations  of  chest  radiographs,  and  provides  a  means  for  accurate  control  of 
size  and  placement  of  simulated  tumors,  blood  vessels,  etc.    Radiographs  of  such  a 
phantom  can  be  much  more  accurately  documented  than  radiographs  of  a  human  subject,  because 
a)  the  relative  positions  of  all  the  objects  in  the  phantom  are  known  with  great 
accuracy,  b)  the  phantom  doesn't  move  during  the  exposure,  c)  the  phantom  permits  the 
use of small-focal-spot tubes, with the concomitant high quality images at the expense
of  long  exposures,  and  d)  the  phantom  permits  repeated  exposures  to  provide  original 
radiographs  for  the  file  of  prototypes.    (Original  radiographs  of  human  subjects 
usually  must  be  kept  at  the  originating  clinic.) 
V.      CONCLUDING  REMARK 

Although  the  needs  for  prototypes  and  the  associated  documentation  are  quite 
complex,  modest  beginnings  in  these  directions  can  be  made  that  would  be  very  helpful 
to  researchers  on  computer-aided  radiography.    Examples  of  these  beginnings  are: 

1)  a  set  of  normal  chest  radiographs 

2)  a  set  of  radiographs  of  chest  phantoms 

3)  a  specification  for  a  three-dimensional  test  pattern. 


Thoughts on Standardization of Parameters for Image Evaluation

Fred  C.  Billingsley 
NASA  Headquarters 

It  may  be  anticipated  that  images  will  be  received  for  image  processing  and  analysis 
from  a  wide  variety  of  sources  and  with  a  wide  variety  of  sensors.  Since  it  is  desirable 
to  have  image  processing  algorithms  be  as  universally  applicable  as  possible,  they  should 
be  designed,  where  possible,  to  be  insensitive  to  the  parametric  variations  of  the  source 
material.  Where  this  is  not  possible,  these  variations  must  be  taken  into  account.  We 
therefore need to consider what parameters may be defined in common across a suite of image
types.

CONSIDERATIONS  OF  SOURCE  TYPE 

Consider  for  example  the  four  images  in  Figure  1,  derived  from  a  radiograph,  the 
Landsat satellite, normal film photography, and a scanning electron microscope. The
image-producing characteristics of these sources are quite different, yet similar pattern
recognition or other image analysis questions may be asked of each.

We  need  first  to  consider  the  difference  between  image  connotations  and  image  content. 
Image  connotation,  in  this  sense,  is  that  set  of  external  knowledge,  culture,  education, 
or other situationally dependent information which is used in conjunction with the data
of  the  image  itself.    This  total  set  of  information  is,  in  effect,  a  model  of  the  real  world 
of  which  the  image  is  a  replica,  and  the  problem  is  to  answer  questions  about  the  real 
world  using  data  from  the  image  as  one  source.    Connotation,  like  beauty,  is  in  the  eye 
of  the  beholder.    Therefore,  if  standard  images  are  used  as  surrogates,  a  wide  variety 
must  be  available  so  that  a  suitable  one  for  a  given  problem  may  be  selected. 

To  avoid  these  connotation  problems,  we  need  to  consider  whether  it  is  possible  to 
define  objective  parameters  or  measurements  of  images  which,  in  the  proper  combinations, 
may  serve  as  surrogates  for  "real  images."    These  parameters  may  be  pixel-specific,  location 
dependent,  or  combinations  thereof.    Parameters  which  have  proven  useful  in  defining  the 
characteristics of images include the gray scale linearity, granularity of the
quantization, spectral content, geometrical fidelity, resolution of the system expressed
as either the point spread function or the modulation transfer function, and the spatial
frequency content and characteristics of the data itself (which in turn may contain various
amplitudes and bandwidths of noise) or the statistics of the spatial variations of local
areas. In addition, we must recognize the difference between intra-pixel (for example,
the pixel resolution) and inter-pixel (for example, the recognition of multi-pixel objects)
effects.

We will consider here only digital images which are used as fodder for digital
processing and will avoid entirely the question of eyeball analysis of visible images, a
subject which has had considerable treatment in the literature, although not primarily from
the point of view of selecting standard images. This question of eyeball analysis, however,
cannot be completely avoided since it serves as one interface between the digital analysis
per se and the human verification of the analysis results.
PARAMETERS 

We now need to consider parameters which have some relation to the real world.
Although photographic film is quite non-linear, certain analyses are only tractable if the
brightness/digital number transfer curve is assumed linear. In addition, the response
of the human eye is approximately logarithmic, and therefore comparison of linear digital
analysis with eyeball analysis must be carefully scrutinized. This raises the first
question which must be considered in the design of a surrogate image so that the analysis may
relate to the real world image for which the problem is being solved:

Will  the  true  data  be  recorded  on  film  originally  over  such  a  wide  range  of 
film  densities  that  parts  of  the  image  will  be  recorded  on  the  toe  and  shoulder 
of  the  recording  characteristic  curve  where  the  local  contrast  is  low?  If 
so,  this  must  be  taken  into  account  when  generating  the  surrogate  image. 
There  seems  to  be  no  reason  to  include  gray  scales  of  the  standard  type  in  the  surrogate 
images  to  be  used  for  digital  analysis;  however,  if  included,  they  will  provide  some  measure 
of  control  of  the  eventual  reproducing  process  used  to  display  the  processed  image. 

Also,  at  least  for  "first  order"  reference  images  we  will  ignore  the  possibilities 
of  coma,  penumbra  effects  and  the  like,  thus  again  opening  the  possibility  of  non-surrogate 
analysis.    We  will  assume  that  the  system  of  interest  is  spatially  stationary.    In  the 
same  sense,  we  will  define  that  the  system  is  geometrically  stationary,  i.e.,  rubber  sheet 
distortions internal to the image will be ignored (again, unless specific situations
require the reintroduction of this factor).

These assumptions will allow us to generate surrogate images with known and predefined
properties  instead  of  requiring  (or  allowing)  us  to  select  images  from  an  external  library 
and  being  required  to  analyze  them  to  determine  the  properties.    Other  properties  which 
cannot  so  easily  be  dismissed  now  need  to  be  considered  in  some  detail. 
QUANTIZATION  GRANULARITY 

The  number  of  gray  levels  used  for  quantizing  a  continuum  image  into  discrete  levels 
will  have  definite  visible  effects  even  at  the  five  or  six  bit  quantization  accuracy  point, 
and  may  affect  the  processing  even  when  more  levels  are  used.    This  leads  us  to  questions 
of  the  following  type: 

What is the sensitivity of the processing to the number of digital
levels?    What  about  truncation  in  the  digital  processing?    Is  there  an 
implicit  level  of  quantization  in  the  data  or  the  sensor  itself?    What  are  the 
noise  characteristics  being  simulated?    How  close  a  separation  of  different 
gray  levels  must  be  detected?    Suppose  they  are  adjacent  or  not  adjacent? 
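
The first of these questions can be explored numerically. The sketch below, in Python, quantizes a brightness ramp to a chosen number of bits and reports the worst-case reconstruction error. It assumes brightness normalized to the range 0 to 1 and uniform quantization steps; the function names are illustrative only and do not come from the workshop.

```python
def quantize(value, bits):
    """Map a brightness in [0, 1] to an integer digital number (DN)."""
    levels = 2 ** bits
    dn = int(value * levels)
    return min(dn, levels - 1)  # keep full-scale brightness in the top level


def reconstruct(dn, bits):
    """Return the brightness at the centre of the DN's quantization interval."""
    return (dn + 0.5) / 2 ** bits


def max_error(bits, samples=1000):
    """Worst-case error over a linear ramp from 0 to 1."""
    ramp = [i / (samples - 1) for i in range(samples)]
    return max(abs(v - reconstruct(quantize(v, bits), bits)) for v in ramp)
```

Under these assumptions the worst-case error is half a step, so each added bit halves it; at five bits the ramp collapses onto 32 distinct levels.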

NOISE 

In  the  absence  of  noise,  a  clean  signal  will  be  recorded  as  the  same  digital  number 
every time it is quantized. In the presence of noise, however, there is a finite probability
that the digital number assigned to the signal will be different from that which would
have been assigned to the clean signal. This effect is illustrated in Figure 2, and curves
pertaining  to  two  commonly  asked  questions  are  shown  in  Figure  3. 

Noise  will  also  perturb  other  types  of  processing,  especially  that  which  needs  uniform 
areas  or  clean  edges.    This  leads  to: 

In  the  true  scene,  what  is  the  variability  in  nominally  uniform  areas  (in 
magnitude and in the spatial frequency content of the variation)? What
is  the  sensor  noise  and  its  bandwidth?    What  data  questions  (see  Figure  3) 
are  being  asked?    Does  noise  vary  with  brightness? 
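
The probability that noise changes the assigned digital number can be estimated by simulation. The following Python sketch is a Monte Carlo estimate under stated assumptions: the clean signal is uniformly distributed over one quantization interval, the noise is Gaussian, and the step size is taken as 1 so that the ratio of step size to rms noise (called `beta` here) fixes the noise level. The function name `prob_within` is hypothetical.

```python
import math
import random


def prob_within(beta, tolerance=0, trials=20000, seed=0):
    """Estimate the probability that noise moves the DN by at most `tolerance`.

    beta is step size / rms noise; the step size is taken as 1.
    """
    rng = random.Random(seed)
    sigma = 1.0 / beta
    hits = 0
    for _ in range(trials):
        clean = rng.random()                     # uniform within interval, DN 0
        noisy = clean + rng.gauss(0.0, sigma)
        if abs(math.floor(noisy)) <= tolerance:  # DN shift caused by the noise
            hits += 1
    return hits / trials
```

Calling `prob_within` over a range of `beta` values, and for several tolerances, traces out curves of the general kind described for Figure 3; the estimate rises toward 1 as the step size becomes large compared with the noise.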
SPECTRAL  CONTENT 

The spectral content may range from one dimension (radiographs, monochrome pictures)
to three dimensions (typical color film) to four (Landsat) to 24 (the NASA multispectral
aircraft scanner). In addition, correlated multitemporal images may effectively have
even more. Pattern recognition and processing in the multispectral domain are very effective
where  available.    However,  in  setting  up  surrogates  for  this  situation,  the  covariance 
between spectral bands which tends to occur, the tendency for the same material to usually
have the same spectral content (in the visible sense, color), and the degree to which given
materials aggregate into more or less uniform areas in the picture must be taken into
account. Pattern recognition techniques for use in this situation (e.g., Landsat analyses)
will  optimally  consider  both  the  tendency  of  a  given  material  to  aggregate  in  patches  in 
the  image  and  also  to  aggregate  statistically  in  multi-dimensional  spectral  space.  Thus 
typical  questions  to  be  answered  might  be: 

What is the typical multispectral distribution expected? Are the clusters
spherical or elliptical, i.e., what is the interband correlation? What
is  the  cluster  spread,  especially  as  compared  to  the  typical  intercluster 
distance?    What  is  the  typical  aggregation  size  distribution  to  be  expected 
in  the  image  for  materials  of  nominally  uniform  spectral  content?    Can  any 
nonspectral  data  be  mapped  congruent  to  the  real  image  and  be  treated  as 
another "spectral band" in a multivariate analysis?
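
Two of these questions, the interband correlation and the cluster spread compared to the intercluster distance, reduce to simple statistics. The Python sketch below shows one way to compute them. It assumes each pixel is a tuple of band values and uses the rms distance from the centroid as the measure of spread; both choices, and all function names, are illustrative assumptions rather than definitions taken from the paper.

```python
import math


def mean(values):
    return sum(values) / len(values)


def interband_correlation(band_a, band_b):
    """Pearson correlation between two spectral bands of the same pixels."""
    mean_a, mean_b = mean(band_a), mean(band_b)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(band_a, band_b))
    var_a = sum((a - mean_a) ** 2 for a in band_a)
    var_b = sum((b - mean_b) ** 2 for b in band_b)
    return cov / math.sqrt(var_a * var_b)


def centroid(pixels):
    """Mean spectral vector of a cluster; each pixel is a tuple of band values."""
    return tuple(mean(band) for band in zip(*pixels))


def rms_spread(pixels):
    """Root-mean-square distance of the pixels from their centroid."""
    center = centroid(pixels)
    return math.sqrt(mean([math.dist(p, center) ** 2 for p in pixels]))


def spread_to_distance(cluster_a, cluster_b):
    """Ratio of combined cluster spread to intercluster distance."""
    gap = math.dist(centroid(cluster_a), centroid(cluster_b))
    return (rms_spread(cluster_a) + rms_spread(cluster_b)) / gap
```

A correlation near 1 indicates a strongly elongated (elliptical) cluster, and a small spread-to-distance ratio indicates clusters that are easy to separate.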
POINT  SPREAD  (IMPULSE  RESPONSE)  FUNCTION 

Any system upon viewing a delta-function object (for example, a star) will reproduce
that object, not as a delta function, but rather as that point spread out in image space around
the ideal delta function location. Since object space may be considered as composed of a
tightly packed array of delta functions each with its own amplitude, the reproduced image
may be considered to be composed of the summation of the corresponding series of point
spread functions (psf), each at its appropriate location and amplitude. This situation
is sketched in Figure 4a.
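
The summation just described is a discrete convolution. A minimal one-dimensional sketch in Python follows; it assumes a spatially stationary psf sampled on the same grid as the object and simply discards energy that falls outside the image. The function name `image_of` is illustrative.

```python
def image_of(obj, psf):
    """Form the image as a sum of psf copies, one per object sample.

    obj: list of object amplitudes (one delta function per sample)
    psf: list of psf weights, centred on its middle element
    """
    half = len(psf) // 2
    out = [0.0] * len(obj)
    for i, amplitude in enumerate(obj):
        for k, weight in enumerate(psf):
            j = i + k - half              # where this psf sample lands
            if 0 <= j < len(out):
                out[j] += amplitude * weight
    return out
```

A single delta function reproduces the psf itself, and two delta functions give the sum of two shifted, scaled copies.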

The basis function used to sample the image at each of the digital locations,
combined with the basis function used to reproduce it at each of those locations, will have
a  filtering  effect  on  the  high  spatial  frequency  content  of  the  image  as  sketched  in  Figure 
4b.    Indeed,  the  Nyquist  criterion  requires  that  this  filtering  function  occur  before  the 
digitization  can  properly  be  made.    This  effect  is  shown  in  the  spatial  frequency  domain 
in  Figure  4d;  the  upper  limit  of  spatial  frequency  content  after  filtering  will  determine 
the  appropriate  digitization  spacing.    The  effect  of  this  filtering  upon  a  sharp  edge  is 
shown in Figure 4c. It can be shown that the edge softening caused by the filtering, when
combined with the digitization spacing appropriate to that same filtering function, will
result in a sharpest obtainable edge transition, from 10% to 90%, of approximately 1.5 pixels.
Thus, any generated images representing edge sampling must consider this effect. In
particular, those processing algorithms which require sharp transitions for their success
will be particularly affected.
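
The 10% to 90% edge transition can be measured for any assumed psf. The Python sketch below does so for a Gaussian psf, which is an assumption made here for illustration; the paper does not state which filtering function underlies its 1.5-pixel figure, and this sketch does not attempt to reproduce that number. The function names are hypothetical.

```python
import math


def edge_response(x, sigma):
    """Brightness across an ideal unit step edge blurred by a Gaussian psf."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))


def crossing(level, sigma, lo=-10.0, hi=10.0):
    """Find by bisection the position where the blurred edge reaches `level`."""
    lo, hi = lo * sigma, hi * sigma
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if edge_response(mid, sigma) < level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def rise_distance(sigma):
    """Distance over which the edge goes from 10% to 90% of full contrast."""
    return crossing(0.9, sigma) - crossing(0.1, sigma)
```

For a Gaussian psf the transition width is proportional to the psf width, so expressing `sigma` in pixels gives the edge transition in pixels directly.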

Thus, the types of questions to which we are led are:

What  is  the  spatial  frequency  response  (or  related  to  that,  the  point  spread 
function)  of  the  imaging  system  simulated?    How  does  the  quantization  spacing 
compare  with  that  required  by  the  Nyquist  criterion?    What  is  the  distribution 
of  energy  as  a  function  of  spatial  frequency  of  the  base  band  data  itself? 
What  reproducing  basis  function  will  be  used  to  produce  a  visible  image  upon 
the  completion  of  the  processing? 

GEOMETRICAL  (AREAL  RELATIONSHIPS) 

Binary  Images 

These  include  that  set  of  images  in  which  only  two  values  of  brightness  are  available, 
namely  black  and  white.    These  images  typically  take  the  form  of  line  drawings  of  specific 
objects,  of  maps  (which  display  the  location  of  objects,  but  do  not  represent  the  objects 
themselves), text, including written music (in which written characters of some language
are used to indicate sounds or concepts), homograms (from gram, a display, and homo, uniform:
a display in which nominally uniform or homogeneous subject areas are displayed as uniform
or uniformly textured areas in the image), and half-tone images (in which gray scales are
represented by a spatially varying binary pattern). Figure 5 illustrates some of these.

Here  we  get  to  the  guts  of  the  pattern  recognition  problem,  and  must  consider  the 
questions  of  the  following  type  in  defining  our  surrogates: 

Is  the  information  to  be  simulated  characterized  by  the  properties  (e.g.,  shapes  and 
sizes)  of  areas  as  defined  by  lines  (the  edges  of  the  areas)?    Or  is  it  in  the 
areal extent (again, size and shape) of uniform or uniformly textured areas in a
homogram? Or is it the lines themselves as in text or in a map? How much culture
(as  opposed  to  information  actually  in  the  data  itself)  is  required  for  the  analysis? 
Is  this  apparent  in  the  image  or  introduced  in  the  subsequent  analysis?    Is  the 
analysis  of  a  given  area  dependent  on  or  independent  of  the  analysis  of  and/or 
interrelationship  with  its  neighbors?    Is  orientation,  size,  or  scale  critical? 
Will  analysis  be  by  statistics  or  by  template  matching?    If  the  former,  what  are 
the statistics required for the size and shape of the subelements; if the
latter, what are the tolerances allowed in the template? Is the analysis
dependent  on  texture  (this  includes  not  only  the  visible  texture,  but 
perhaps  the  impression  of  gray  scale  evidenced  as  texture  in  the  half-tone 
image)?    If  the  half-tone  image,  do  we  handle  it  binary-wise  or  do  we 
reduce  resolution  and  average  out  the  dots  and  treat  it  as  a  gray  scale 
image?    Is  edge  sharpness  important?    How  about  affine  transformations, 
such  as  rubber  sheet  stretching? 
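
The second option raised for half-tone images, reducing resolution and averaging out the dots, can be sketched briefly in Python. The sketch assumes the binary image is a list of rows of 0s and 1s and that the averaging block roughly matches the half-tone cell size; partial blocks at the edges are dropped. The function name `average_blocks` is illustrative.

```python
def average_blocks(binary, block):
    """Reduce a binary image to gray scale by averaging block x block cells."""
    rows, cols = len(binary), len(binary[0])
    gray = []
    for r in range(0, rows - block + 1, block):
        gray_row = []
        for c in range(0, cols - block + 1, block):
            total = sum(binary[r + i][c + j]
                        for i in range(block) for j in range(block))
            gray_row.append(total / (block * block))  # fraction of white dots
        gray.append(gray_row)
    return gray
```

The result trades spatial resolution for gray levels: a block of n by n binary dots yields at most n*n + 1 distinct gray values.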
With  Gray  Scale 

Again we need to differentiate between information directly within the image and
interpretations based on the image used as a data source to allow us to solve an external
model  (i.e.,  the  culture  factor  again).    Analysis  of  information  completely  contained  in 
the  image  might  be  characterized  by  (but  obviously  not  limited  to)  automated  clustering, 
which  reduces  the  incoming  image  to  homograms.    For  the  gray  scale  case  we  must  answer 
questions  of  the  same  type  as  proposed  for  the  binary  case,  but  in  addition,  must  ask  such 
questions  as: 

How  shall  we  handle  gradually  shaded  areas  such  as  produced  in  images  of 
cylinders?    How  do  we  define  or  "know"  what  is  represented? 

Again  we  are  faced  with  a  culture  problem  since  the  "know"  is  not  information  in  an 
image.    For  machine  analysis,  this  culture  (i.e.,  the  model  being  solved  using  the  image 
as  a  data  source)  must  be  defined  to  the  computer  in  all  its  painstaking  details.  Then 
for proper simulation the algorithm must exercise all of the expected combinations and must,
therefore, include examples of each.
CONCLUSION 

Thus, although surrogate images may be generated in terms of quantifiable and definable
parameters such as have been outlined above, the set of parameters actually considered, and
the  ranges  over  which  they  must  be  varied,  must  be  defined  in  terms  of  the  model  being 
solved.    In  order  to  properly  exercise  and  test  the  analysis,  the  right  questions  must 
be asked, answered and simulated. Simulation by generated images, as outlined here, allows
the control of various parameters which are important and allows the generation of surrogate
images which can be designed to exercise the algorithms to their utmost.

However, just as a hammer will not turn screws (it is the wrong tool/problem
combination), the wrong surrogate/analysis combination will be unproductive at best, or perhaps
downright  misleading.    In  this  sense,  although  the  factors  outlined  above  are  considered 
to  be  important  (but  not  necessarily  exhaustive)  in  the  understanding  and  definition  of  the 
problem,  careful  consideration  must  be  given  to  the  entire  problem  situation  to  assure 
that  all  factors  have  been  considered. 


Fig.  1.    Examples  of  images  from  widely  divergent  sources,  a)  Radiograph,  b)  Landsat 
Multispectral  Scanner,  c)  Film  Photography,  d)  Scanning  Electron  Microscope. 

Fig. 2. Effects of noise on quantization accuracy (sketch).

[Figure 3, first panel. Horizontal axis: β = step size/rms noise, from 0.1 to 10.]
Given a signal uniformly distributed over the quantization intervals and
Gaussian noise of rms value σ, the curves show the probability of correctly
assigning a digital value corresponding to the noise-free signal within
±0, ±1, ... ±9 DN (inclusive) as a function of the ratio β = step size/σ.

Fig.  3.    Effects  of  noise  on  quantization  and  on  ability  to  differentiate  two  areas  of 
different  brightness. 

[Figure 3, second panel.] Given two signals which have been perturbed
by Gaussian noise of rms value σ, each quantized to the same number of
bits, the curves give the probability of correctly determining the true difference
in the two levels within ±0, ±1, ... ±4 DN (inclusive) as a function of the ratio
β = step size/σ.

Fig. 4. Filtering effect of the psf. a) Effect of finite psf on delta function, b) Effect
on a data trace, c) Effect on edge softening, d) Effect shown in spatial frequency
domain.

Fig. 5. Illustrations of types of images. a) Map, b) Line Drawing, c) Homogram, d) Text.


WORKSHOP ON STANDARDS FOR IMAGE PATTERN RECOGNITION


June  3-4,  1976 


PROGRAM 


THURSDAY,  JUNE  3,  1976 
8:30  AM  REGISTRATION 
Ruth M. Davis, Director

Russell A. Kirsch

Roger N. Nagel

National  Institutes  of  Health 

10:30 AM        STANDARD TAPE FORMATS
Chairpersons:
John Dehne
Army Night Vision Lab


William  Alford 

Goddard  Space  Flight  Center 

Prototype data formats will be presented and discussed by a multi-disciplinary
panel. A questionnaire will be distributed for completion by Workshop
participants.


Panelists:
Theo.  Pavlidis 
Princeton  University 

John  Sos 

National  Aeronautics  and  Space  Administration 


Sandra  Hawley 
ESL  Incorporated 


12:30  PM  LUNCH 


1:30  PM         IMAGE  CONTENT  AND  STRUCTURE 
Chairpersons: 
Azriel  Rosenfeld 
University  of  Maryland 

Russell  A.  Kirsch 

Applied  Mathematics  Division,  NBS 


To make images maximally useful for interchange, it is necessary for the
provider of the data to furnish descriptions of the image content. At the simplest
level, such descriptions refer to the image as a whole. But most interesting
and important image sources require articulated descriptions of the image
structure giving the relations of the parts of the image as well as their names.
Such image descriptions are useful not only for image data base management,
but also as standards for evaluating the success of pattern recognition
algorithms.

A  research  problem  that  currently  exists  is  the  development  of  suitable  languages 
in  which  to  embed  descriptions  of  image  content  and  structure.    This  session 
of  the  Workshop  will  be  devoted  to  a  discussion  of  the  status  of  research  and 
related  work  in  graphic  languages  to  provide  the  needed  basis  for  image  content 
and  structure  description. 

Panelists:
M. A. Fischler
Lockheed Research Lab

K. S. Fu
Purdue University

J. O'Callaghan
CSIRO, Australia

R. F. Sproull
Xerox Corp.


3:30 PM         COFFEE

3:45 PM         IMAGE CONTENT AND STRUCTURE (CONT.)

5:30 PM         RECEPTION AND DINNER

FRIDAY,  JUNE  4,  1976 

9:00  AM        DOCUMENTATION  OF  THE  RECORDING  ENVIRONMENT 
Chairpersons:
Judith M. S. Prewitt
National Institute of Health

John M. Evans, Jr.
Office of Developmental Automation and Control Technology, NBS


The content of the original image may be significantly altered in reducing it
to data on a magnetic tape. Optical system response, sensor response,
electronic processing, and digitization all will affect the data recorded.
This session will identify the major problems and discuss the documentation
needed to describe the relation of the original object to the data recorded
on tape.

Panelists:
Wayne Huelskoetter
Dicomed Corp.

Kendall M. Preston, Jr.
Carnegie Mellon University

James Greenleaf
Mayo Clinic

Werner Frei
University of Southern California

J. K. Zieniuk
Bureau of Radiological Health




Marvin  Maxwell 

National  Aeronautics  and  Space  Administration 


Brent  Baxter 
University  of  Utah 


Joseph  Boccia,  M.D. 
U.  S.  Army 


10:30 AM        COFFEE

10:45 AM        RECORDING ENVIRONMENT (CONT.)

12:00           PROTOTYPE IMAGES


Chairpersons: 
Theodosius  Pavlidis 
Princeton  University 

Roger  N.  Nagel 

National  Institute  of  Health 

The  selection  and  distribution  of  prototype  images  would  assist  researchers 
and  lead  to  better  intercomparison  of  alternative  approaches.    The  panel  will 
address the following two issues: (1) the selection of criteria for the
evaluation of prototype images and (2) the solicitation of data bases for a centrally
maintained file of prototype images.

Panelists:
Jack  Sklansky 

University  of  California,  Irvine 
Sam  Dwyer 

University  of  Missouri 

Fred  Billingsley 
Jet  Propulsion  Lab 


12:30 PM        LUNCH

1:30 PM         PROTOTYPE IMAGES (CONT.)

3:30 PM         ADJOURN